Abstract. In this paper we extend the continuous averaging method of Treschev
to the setting of simultaneous Diophantine approximation and apply the result
to give a new proof of the Nekhoroshev theorem. We obtain a sharp normal
form theorem and explicit estimates of the stability constants appearing in the
Nekhoroshev theorem.



1. Introduction 

In the papers [Tr1, Tr2], Treschev developed a new averaging method called continuous
averaging. It is a powerful tool for deriving sharp constants in exponentially
small splitting problems for Hamiltonian systems with one and a half degrees of freedom.
However, the technicalities become very heavy when the method is applied to
Hamiltonian systems with more degrees of freedom. For this reason, the method has
not yet been applied to other problems.

In this paper, we use continuous averaging to give a new proof of the Nekhoroshev
theorem. We consider the following analytic nearly integrable Hamiltonian system:

(1.1)    $H(I,\theta,x,y) = H_0(I) + \varepsilon H_1(I,\theta,x,y).$

The phase space is

$(I,\theta,x,y) \in \mathcal{D} := G^n \times (\mathbb{R}/2\pi\mathbb{Z})^n \times W^{2m} \subset \mathbb{R}^n \times (\mathbb{R}/2\pi\mathbb{Z})^n \times \mathbb{R}^{2m}, \quad n \ge 2,\ m \ge 0.$

We complexify the variables and extend the domain of $(I,x,y)$ to a $\sigma$-neighborhood
and that of $\theta$ to a $\rho$-neighborhood of the original domains, respectively. The extended
phase space in the complex domain is

$\mathcal{D}(\rho,\sigma) := (G^n + \sigma) \times ((\mathbb{R}/2\pi\mathbb{Z})^n + \rho) \times (W^{2m} + \sigma) \subset \mathbb{C}^n \times (\mathbb{C}/2\pi\mathbb{Z})^n \times \mathbb{C}^{2m},$

where $\rho$ is the width of analyticity in $\theta$ and $\sigma$ is that of the slow variables $I, x, y$.

As stated in [Ne], [L2], [LNN], [Po], [BM], the Nekhoroshev theorem ensures that
when the unperturbed Hamiltonian $H_0$ is quasi-convex, by which we mean that
the set $\{I \mid H_0(I) \le E\}$ is strictly convex, the following general estimate holds for
sufficiently small $\varepsilon$:

(1.2)    $\|I(t) - I(0)\| \le C_0\,\varepsilon^{b}$, when $|t| \le T = C_1 \exp(C_2\,\varepsilon^{-a})$,

for some constants $a, b, C_0, C_1, C_2 > 0$ independent of $\varepsilon$, where $I(t)$ is the action
variable component of any orbit associated to the Hamiltonian (1.1) with initial condition
in the set $\mathcal{D}$.

There are many works studying the stability exponents $a$ and $b$ (cf. [LN], [Po], [BM]).
Their approaches are based on a careful study of the geometric and number-theoretic
aspects of resonances. Instead, in this paper we sharpen the estimates in
the analytic part of the proof, using continuous averaging to obtain an improved
normal form (see Theorem 3.1). Then we apply the normal form to Lochak's argument
to get the Nekhoroshev theorem (see Theorem 2.1), where all the stability constants
are estimated explicitly. In this paper, we only work on the case $a = b = 1/(2n)$, but
the normal form theorem can be easily applied to other prescribed $a$ and $b$ to get
the corresponding $C_i$.

The method of Lochak is called the simultaneous Diophantine approximation, which
turns out to be an important alternative to the classical approach via small divisor
techniques, as explained in [L2]. Its main idea is to do the averaging in a vicinity
of a periodic orbit, so it is essentially an averaging procedure for systems with
one fast angle. In general, we can kill the dependence on the fast angle up to
exponential smallness. This makes the simultaneous Diophantine approximation
suitable for proving the Nekhoroshev theorem. The work [PT] can be considered as a
development of continuous averaging for the small divisor case. This paper is
the first to develop continuous averaging for the simultaneous
Diophantine approximation.

We point out the relation between continuous averaging and some important PDEs.
The idea of continuous averaging is to study the averaging procedure using a PDE
instead of iterations. The PDE has the form $H_\delta = \{H,F\}$, where $F$ is the Hilbert
transform of $H$ in some special cases (see Section 3 for more details). This type of
equation has been studied (cf. [CCF]) as a simplified model for the quasi-geostrophic
equation (cf. [KNV]), the incompressible Euler equation, etc. It would be interesting
if we could apply some PDE techniques to our problem.

To state our theorems, we need the following definitions. 

Definition 1.1. (1) We use $|\cdot|$, $|\cdot|_2$, $|\cdot|_\infty$ to denote the $\ell^1$, $\ell^2$, $\ell^\infty$ norms for
a vector in $\mathbb{R}^n$ or $\mathbb{C}^n$. When no confusion arises, we also use $|\cdot|$ to denote
the absolute value of a function whose range is in $\mathbb{R}$ or $\mathbb{C}$.
(2) For a function $f(I,\theta,x,y)$, the weighted Fourier norm is defined as

$\|f\|_{\rho'} = \sup_{I,x,y} \sum_{k\in\mathbb{Z}^n} |f^k|\, e^{\rho'|k|}, \quad \rho' \le \rho,$

if we have the Fourier expansion $f(I,\theta,x,y) = \sum_{k\in\mathbb{Z}^n} f^k(I,x,y)\, e^{i\langle k,\theta\rangle}$ and
the other variables $(I,x,y) \in (G^n+\sigma)\times(W^{2m}+\sigma)$.
(3) $\mu := \max\{\|H_1\|_\rho, \|\nabla H_1\|_\rho\}$.
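
As an illustration of item (2), the following minimal sketch evaluates the weighted Fourier norm of a trigonometric polynomial at one frozen value of $(I,x,y)$. The function name, the dictionary representation of the coefficients and the sample polynomial are assumptions made for this example only; the supremum over $(I,x,y)$ is not computed.

```python
import math

def weighted_fourier_norm(coeffs, rho_prime):
    """Sum of |f^k| * exp(rho' * |k|_1) over the Fourier modes k.

    `coeffs` maps integer tuples k to complex coefficients f^k evaluated
    at one fixed (I, x, y); taking the sup over (I, x, y) is left out.
    """
    return sum(abs(c) * math.exp(rho_prime * sum(abs(ki) for ki in k))
               for k, c in coeffs.items())

# Hypothetical trigonometric polynomial in two angles.
f = {(0, 0): 1.0, (1, 0): 0.5j, (1, -2): 0.25}
print(weighted_fourier_norm(f, 0.0))   # rho' = 0 gives the plain sum of moduli
print(weighted_fourier_norm(f, 0.1))   # a larger rho' weights high modes more
```

With $\rho' = 0$ the norm reduces to the sum of the moduli of the coefficients; increasing $\rho'$ penalizes modes with large $|k|$, which is how the norm encodes the width of analyticity in $\theta$.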

We also use the following definition to characterize the convexity of the unperturbed
part $H_0(I)$.

Definition 1.2. Consider a Hamiltonian $H_0(I)$ defined on $G^n + \sigma$. We define
the associated constants $M_\pm > 0$, which characterize the convexity of $H_0(I)$, by

(1.3)    $0 < M_-\,|v|_2^2 \le |\langle \nabla^2 H_0(I)v, v\rangle|, \quad v \in \mathbb{R}^n\setminus\{0\},\ I \in G^n,$
         $|\nabla^2 H_0(I)v|_2 \le M_+\,|v|_2, \quad v \in \mathbb{C}^n\setminus\{0\},\ I \in G^n + \sigma.$



Now we state a simplified version of our main theorem. The complete version is
stated in the next section.

Theorem 1.1. Consider a Hamiltonian system satisfying inequalities (1.3) in
Definition 1.2 and $n \ge 2$, $m \ge 0$. For every orbit $(I,\theta,x,y)(t)$ with initial condition
$(I,\theta,x,y)(0) \in \mathcal{D}(\rho,\sigma)$ and $(x(t),y(t)) \in W^{2m} + \sigma$, we have the following estimates
provided $\varepsilon$ is small enough.



\Kt)-m\2< 8V ^I M+ e^\VH Q 



OO ■ 



for 



\t\ < , , — exp 



|V# |oo \\M + J Sjn^le 1 /^ 
where p = p\ + 2p 2 + P3 satisfies 




Pi 



Qp M 2 

3"l Vjtl °loo: 

n+1 



3e> 3 ( PlMl \ ^ 4 a 



n!|VF |oo \p 3 A^^TM + J 25 
The norm $|\cdot|_\infty$ is taken over $I \in G^n$.



This theorem gives the estimate of the stability constant $C_2$ in (1.2). For a given
system, we need to optimize $\rho_1$ under the constraints in the theorem. We see that
the decomposition of $\rho$ can be qualitatively written as $\rho = \rho_1 + c_0\,\mu^{1/n} + c_1\,\rho_1^{1+1/n}$,
where the constants $c_0 = c_0(M_\pm, |\nabla H_0|_\infty)$ and $c_1 = c_1(M_\pm, |\nabla H_0|_\infty, n, \sigma)$. Although
the optimization is not solved explicitly, we expect our estimate to improve the previous results
[LNN], [N], since the continuous averaging method gives us an improved normal form
theorem (see Theorem 3.1).

A possible application of the result is to the 3-body problem, in order to obtain long
time stability. This direction was pioneered in [N], but the mass ratio of
Jupiter to the Sun obtained in [N] is too small to be satisfactory. On the other
hand, in [FGKR], the authors construct diffusing orbits for the restricted planar 3-body
problem. The diffusion time there is polynomial with respect to $1/\varepsilon$.

The paper is organized as follows. First we give a complete statement of the main
theorem and compare it with previous results in Section 2. Then we state a normal
form theorem, Theorem 3.1, about averaging in a vicinity of periodic orbits in Section 3. This
is the main result that we obtain using continuous averaging, and it improves the
corresponding one in [LNN], [N]. Then we give a brief introduction to the continuous
averaging method in Section 4. After that we give a proof of Theorem 3.1 in
Section 5. This section is a higher dimensional generalization of the case studied in
Section 4, and we try to draw an analogy between the two sections. With the normal form
theorem, we first show the local stability result of the Nekhoroshev theorem in Section 6
and then global stability in Section 7. Here local stability means the stability result
in a neighborhood of a periodic orbit, and global stability means stability for all
initial conditions. Finally, we have two appendices, A and B. The first contains
some technical estimates for the continuous averaging. The second covers the
basics of majorant estimates.



2. The complete statement of the main theorem and discussions

We give a complete statement of the main theorem as follows.

Theorem 2.1. Under the same assumptions as Theorem 1.1, we have

\m-m\2< 8V ^ M+ e 1/2n \vHou 



for \t\ <T = , , — exp 



1 ( fM-\ 2 pi 



where p = pi + 2p2 + P3, provided the following restrictions are satisfied. 



6u Mi, 
P2 ~ M 3 



2 

oo > 



l/2n^ ■ /V^T|W/o|2 M 2 _ 
£ L/An < mm i 1 —{P2 + P3) 



5npM„ ' °' ' 4v / r^T|V 3 #o|oo 

M 2 _ ( a pi hp\ 



8Vn^lM + \5(^n + 2^)|VF |oo 2(n + 2m)p 3 2ap 3 



)}■
3e a p 3 ( piM 2 \ n+1 4a



< 



n!|V# |oc \p^^n~^lM + ) 25' 

£l/2 „ 16 nCTf + ( b rfAtf.-"" \ < L 
piM^ I 32n(n- l)M% Pz J ~ 

The norm $|\cdot|_\infty$ is taken over $I \in G^n$.

The constant $\mu$ plays the same role as the constant $E$ in [LNN], [N]. It is dual to $\varepsilon$
since only the product $\varepsilon\mu$ enters the original Hamiltonian. We need the smallness
of $\mu$ to satisfy the first bullet point in Theorem 2.1. The same restriction
is expressed in [LNN], [N] by introducing a constant $g$. The second bullet point can
be satisfied easily by taking $\varepsilon$ small enough. To improve the stability time, we
want $\rho_1$ to be as large as possible, but the third bullet point restricts
$\rho_1$, so we need to optimize among $\rho_1, \rho_2, \rho_3$. This restriction appears due to
the finiteness of the width of analyticity of the action variables $I$ and the degenerate
variables $x, y$. It shows up in a different form in [LNN] as item (ii) of Theorem 2.1,
where the choice of $R$ there can be as small as $\varepsilon^{1/2n}$. We give more discussion
in Remarks 6.1 and 7.1. We will see from Theorem 3.1 below that our normal
form theorem obtained from continuous averaging improves that obtained from the
iteration method. Therefore we also get an improved $C_2$ here, even though $\rho_1$ is
not expressed explicitly.

3. Normal form

Our main work in this paper is to obtain a normal form theorem using continuous
averaging. Following Lochak, we do the averaging in a neighborhood of a periodic
orbit.

Definition 3.1. We define $\omega^* = (p_1, p_2, \dots, p_n)/\tilde T$, $p_i \in \mathbb{Z}$, $\tilde T \in \mathbb{R}\setminus\{0\}$, and
$\mathrm{g.c.d.}(p_1, p_2, \dots, p_n) = 1$. This is the frequency vector of a periodic orbit of the
unperturbed Hamiltonian $H_0$. The period $T$ of this vector is $T = 2\pi\tilde T$.

Integer vectors $k$ with $\langle\omega^*, k\rangle \ne 0$ give us

(3.1)    $|\langle k,\omega^*\rangle| = |k\cdot(p_1, p_2, \dots, p_n)|/\tilde T \ge 1/\tilde T.$
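
The lower bound (3.1) can be checked numerically on a small example. The sketch below is only an illustration: the function name, the search box and the sample values of $p$ and $\tilde T$ are hypothetical, and the search is a brute-force scan rather than anything used in the proof.

```python
from functools import reduce
from itertools import product
from math import gcd

def min_nonzero_divisor(p, T_tilde, box=6):
    """Smallest nonzero |<k, omega*>| over integer k with |k_i| <= box,
    where omega* = p / T_tilde and p is an integer vector with gcd 1."""
    assert reduce(gcd, p) == 1, "p must have gcd 1"
    best = None
    for k in product(range(-box, box + 1), repeat=len(p)):
        dot = sum(ki * pi for ki, pi in zip(k, p))
        if dot != 0:  # skip resonant vectors <k, omega*> = 0
            val = abs(dot) / T_tilde
            best = val if best is None else min(best, val)
    return best

# Hypothetical periodic frequency omega* = (2, 3, 5) / 7.5.
p, T_tilde = (2, 3, 5), 7.5
print(min_nonzero_divisor(p, T_tilde), 1 / T_tilde)
```

Since $\langle k, p\rangle$ is a nonzero integer for every nonresonant $k$, the two printed numbers agree as soon as the box contains a vector with $\langle k, p\rangle = \pm 1$, for instance $k = (-1, 1, 0)$ here.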

After a proper translation in the space of action variables, we assume without loss of generality that

$\omega(0) := \nabla H_0(0) = \omega^*.$

We can split the Hamiltonian (1.1) into four parts,

(3.2)    $H(I,\theta,x,y) = \langle\omega^*, I\rangle + G(I) + \varepsilon\bar H(I,\theta,x,y) + \varepsilon\tilde H(I,\theta,x,y),$
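where each of the terms is given in the next definition.

Definition 3.2. We use the Taylor expansion of $H_0$ to split it as

$H_0(I) = \langle\omega^*, I\rangle + G(I),$

where $G(I)$ contains the higher order terms. For the $H_1$ part, we use the Fourier
expansion $H_1 = \sum_{k\in\mathbb{Z}^n} H^k(I,x,y)\,e^{i\langle k,\theta\rangle}$ to write

$\varepsilon H_1(I,\theta,x,y) = \varepsilon\bar H(I,\theta,x,y) + \varepsilon\tilde H(I,\theta,x,y),$

where

$\varepsilon\bar H := \varepsilon \sum_{\langle k,\omega^*\rangle = 0} H^k e^{i\langle k,\theta\rangle}$, the resonant part,

$\varepsilon\tilde H := \varepsilon \sum_{\langle k,\omega^*\rangle \ne 0} H^k e^{i\langle k,\theta\rangle}$, the nonresonant part.

The Hamiltonian equations can be written as

(3.3)
$\dot I = -\varepsilon\,\frac{\partial\bar H}{\partial\theta}(I,\theta,x,y) - \varepsilon\,\frac{\partial\tilde H}{\partial\theta}(I,\theta,x,y),$
$\dot\theta = \omega^* + \nabla G(I) + \varepsilon\,\frac{\partial\bar H}{\partial I}(I,\theta,x,y) + \varepsilon\,\frac{\partial\tilde H}{\partial I}(I,\theta,x,y),$
$\dot x = -\varepsilon\,\frac{\partial H_1}{\partial y}, \qquad \dot y = \varepsilon\,\frac{\partial H_1}{\partial x}.$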

Theorem 3.1. Suppose $|I|_2, |x|_2, |y|_2 < \mathcal{R}$, $(I,\theta,x,y) \in \mathcal{D}(\rho,\sigma)$ for some $\mathcal{R}, \rho, \sigma > 0$.
Then there exists $\varepsilon_0 > 0$ such that for any $0 < \varepsilon < \varepsilon_0$, there exist $\rho_1, \rho_2, \rho_3 > 0$
such that $\rho_1 + 2\rho_2 + \rho_3 = \rho$ and a symplectic change of variables $(I,\theta,x,y) \to (I',\theta',x',y')$,
for $|I'|_2 < \mathcal{R}$, $(I',\theta',x',y') \in \mathcal{D}(\rho_2, 4\sigma/5)$, which sends the Hamiltonian
(3.2) to the following normal form:
H = (co*, I') + G(I') + e*(7', 9',x', y') + ett(j', 9', x' , y'). 



with the nonresonant part [see Definition 3.2) 
^(I',9',x,y) 
5/i 



5p ( 2irpi 
< 3T ex P 



P2 ~ p% "V M+KT 



the resonant part \\^>\\^ < — , and the change of variables 
I (/', 9', x', y>) - (1, 0, x, y)U < 5g/ f 

2ix{p 2 + P3) n 

The £0, 7Z, K = — - -J^=r=. and px,p2,p3 satisfy the following restrictions, 
p 3 M + TZT 

• $5(\sqrt{n} + 2\sqrt{m})\,\mathcal{R} \le \sigma$,

• $2(n+2m) \le K$, $2\sigma/(5\rho_1) \le K$,
3 (2K) n - 1 K 2 p 3 ( - ( 2K 2 p 3 \\ 4o-
-e a euT± '— — 1 + ■% 1 + In — < — . 



Moreover, $\rho_1$ can be arbitrarily close to $\rho$ if $\varepsilon$ is sufficiently small (see Remark 6.1).

The exponential smallness obtained here improves that of [LNN], [N]. Continuous
averaging enables us to get rid of some extraneous numerical factors that worsen the
estimates. Moreover, our method has the advantage that we do not need the
preliminary transform that is necessary in [LNN], [LN]. The proof of this result is
contained in Section 5.



4. A brief introduction to the continuous averaging

In this section, we give an introduction to the continuous averaging method; see
Chapter 5 of [TZ] for more details. We try to explain the key points of the
method that will be used in our later proof.



4.1. Derivation of the continuous averaging equation. We write the Hamiltonian
(1.1) as $H(z)$, $z = (I,\theta,x,y)$. Suppose we have a symplectic change of
variables $z \mapsto z(Z(\delta),\delta)$ depending on a parameter $\delta$, where $Z(\delta)$ denotes the new
variables. Then we have

$H(z) = H(z(Z,\delta)) := H(Z,\delta),$

(4.1)    $\frac{\partial H}{\partial Z}\,\frac{dZ}{d\delta} + \frac{\partial H}{\partial\delta} = 0.$

If we choose the flow of $\frac{dZ}{d\delta}$ to be the Hamiltonian flow generated by a Hamiltonian
isotopy $F(Z,\delta)$, i.e.

(4.2)    $\frac{dZ}{d\delta} = J\,dF(Z,\delta), \qquad J = \begin{pmatrix} 0 & \mathrm{Id} \\ -\mathrm{Id} & 0 \end{pmatrix},$

with initial value $Z|_{\delta=0} = z$, then the change of variables is symplectic and we get

(4.3)    $H_\delta = -\{H,F\}_Z = -\{H,F\}_z,$

where the subscript $\delta$ means partial derivative. The last equality follows from the
fact that the Poisson bracket is invariant under symplectic transformations. In the
following, we only work with the variables $z$.

To simplify our discussion, we consider a special case of (1.1) with $m = n = 1$.
A further simplification is to consider only time-periodic nonautonomous systems.
This is equivalent to requiring that $H_0(I) = I$ in equation (1.1) and that $H_1(x,y,\theta)$ is
independent of $I$. From equation (4.3), we have:

(4.4)    $H_\delta = -F_\theta - \{H,F\}_{(x,y)},$

where $\{\cdot,\cdot\}_{(x,y)}$ stands for the $x, y$ part of the Poisson bracket.
Our goal is to show that
if we choose a suitable Hamiltonian isotopy $F$ and extend $\delta$ as far as possible, the
dependence of $H$ on $\theta$ can be made exponentially small, i.e. $O(e^{-c/\varepsilon})$ for some
constant $c$.

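Suppose $H(z,\delta)$ has the Fourier expansion

(4.5)    $H(z,\delta) = I + \varepsilon\langle H_1\rangle + \varepsilon\sum_{k\in\mathbb{Z}\setminus\{0\}} H^k(x,y,\delta)\,e^{ik\theta},$

where $\langle H_1\rangle$ means the zeroth Fourier coefficient of $H_1$.

We choose the Hamiltonian isotopy $F$ as the "Hilbert transform":

(4.6)    $F(z,\delta) = -i\varepsilon\sum_{k\in\mathbb{Z}\setminus\{0\}} \sigma_k H^k(x,y,\delta)\,e^{ik\theta}, \qquad \sigma_k = \mathrm{sgn}(k).$

In terms of Fourier coefficients, equation (4.4) now takes the form

(4.7)    $H^k_\delta = -|k|H^k + i\varepsilon\sigma_k\{\langle H_1\rangle, H^k\}_{(x,y)} + i\varepsilon\sum_{l+m=k}\sigma_m\{H^l, H^m\}_{(x,y)}, \qquad k\in\mathbb{Z}\setminus\{0\}.$

We show that this $F$ is a good choice that makes the dependence on $\theta$ decrease
exponentially.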
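
As a consistency check, equation (4.7) can be recovered by substituting the expansion (4.5) and the choice (4.6), read as $F = -i\varepsilon\sum_{k\neq 0}\sigma_k H^k e^{ik\theta}$ with $\sigma_k = \mathrm{sgn}(k)$, into (4.4). The sketch below assumes the sign convention $\{I, F\} = F_\theta$ implicit in (4.4) and that the $H^k$ do not depend on $I$; it only uses $k\sigma_k = |k|$.

```latex
% Each term of the right-hand side of (4.4), expanded in Fourier modes
\begin{aligned}
-F_\theta
  &= i\varepsilon \sum_{k\neq 0} (ik)\,\sigma_k H^k e^{ik\theta}
   = -\varepsilon \sum_{k\neq 0} |k|\, H^k e^{ik\theta},\\
-\{\varepsilon\langle H_1\rangle, F\}_{(x,y)}
  &= i\varepsilon^2 \sum_{k\neq 0} \sigma_k \{\langle H_1\rangle, H^k\}_{(x,y)}\, e^{ik\theta},\\
-\Big\{\varepsilon\sum_{l\neq 0} H^l e^{il\theta},\, F\Big\}_{(x,y)}
  &= i\varepsilon^2 \sum_{l,m\neq 0} \sigma_m \{H^l, H^m\}_{(x,y)}\, e^{i(l+m)\theta}.
\end{aligned}
% The left-hand side of (4.4) has k-th Fourier coefficient \varepsilon H^k_\delta,
% so dividing the k-th coefficients by \varepsilon gives the three terms of (4.7).
```

The first line shows why the Hilbert transform is the natural choice: the factor $\sigma_k$ turns the derivative $ik$ into the damping rate $-|k|$ for every nonzero mode.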

4.2. The choice of the Hamiltonian isotopy F.

4.2.1. Heuristic argument. Following [TZ], we explain here the heuristic ideas that
justify this choice of $F$. If we set $\varepsilon = 0$ in (4.7), we get

$H^k_\delta = -|k|H^k,$

whose solution tends to zero as $\delta \to \infty$. If we neglect the third term on the RHS
of (4.7), we have

$H^k_\delta = -|k|H^k - i\varepsilon\sigma_k\{H^k, \langle H_1\rangle\}_{(x,y)}.$

It has an exact solution of the form

(4.8)    $H^k(I,x,y,\delta) = e^{-|k|\delta}\,H^k(I,x,y,0)\circ g^{-i\varepsilon\sigma_k\delta},$

where $g$ means the Hamiltonian flow generated by the Hamiltonian $\langle H_1\rangle$. Notice
the imaginary unit $i$ here. It tells us that the flow is considered with purely imaginary
time. As $\delta$ increases, the complex width of analyticity is lost gradually. So
formula (4.8) makes sense only if we take $\varepsilon\delta < \rho$, where $\rho$ is the width of analyticity
in $\theta$. This is an obstacle to the extendability of $\delta$.

We see from the heuristic argument that this choice of $F$ gives us the exponential
decay as well as a good guess for the stopping time.



4.2.2. Comparison with the Lie method. The Lie method is used in the works [N],
[LN], [LNN]. Before working out the detailed proof of the above heuristic argument,
we first explain the "Hilbert transform". In fact this choice of $F$ is strongly related
to classical averaging theory. Let us recall what we usually do in the Lie method.

Define the linear operator of taking the Lie derivative along the Hamiltonian flow
generated by the Hamiltonian function $F$: $\mathcal{L}_F H = \{H,F\}$.

The time-1 map of (4.1) and (4.3) is

$H|_{\delta=1} = e^{\mathcal{L}_F}H = H + \{H,F\} + \frac{1}{2}\{\{H,F\},F\} + \dots = H_0 + \varepsilon H_1 + \{H_0,F\} + \text{h.o.t.}$

In each step of the iteration, we need to solve the cohomological equation

(4.9)    $\varepsilon H_1 + \{H_0,F\} = 0.$

In fact, we are only able to solve

(4.10)    $\varepsilon H_1 - \varepsilon\langle H_1\rangle + \{H_0,F\} = 0.$

By comparing the Fourier coefficients, we obtain

(4.11)    $\varepsilon H^k(I) + ikF^k = 0 \;\Longrightarrow\; F^k = \frac{i\varepsilon H^k(I)}{k}, \qquad k \ne 0.$

Now we can explain why we choose $F$ as the Hilbert transform of $H$ in (4.6). We
select the $F$ of (4.6) to inherit the most important information in the $F^k$ of (4.11),
namely the imaginary unit $i$ and $\mathrm{sgn}(k)$. Readers can check that we still get the
heuristic argument above if we choose the $F$ whose Fourier coefficients are (4.11) to
do the averaging in (4.3).



4.3. The integral equation. Now we take into account the third term on the RHS
of equation (4.7). We first remove the $-|k|H^k$ term in equation (4.7) by setting
$H^k = e^{-|k|\delta}u^k$ to obtain

(4.12)    $u^k_\delta = -i\varepsilon\sigma_k\{u^k, \langle H_1\rangle\}_{(x,y)} + 2i\varepsilon\sum_{l+m=k,\ m<0<l}\sigma_m\,e^{-(|l|+|m|-|k|)\delta}\{u^l,u^m\}_{(x,y)}.$

If we define an operator $g^{t\delta}{*}f := f\circ g^{t\delta}$, where $g^t$ is the flow generated by the
Hamiltonian $-\langle H_1\rangle$, the exact solution of the truncated equation

$u^k_\delta = -i\varepsilon\sigma_k\{u^k, \langle H_1\rangle\}_{(x,y)}$

would be $g^{\varepsilon\sigma_k i\delta}{*}u^k(x,y,0)$.
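
The restriction $m < 0 < l$ and the factor $2$ in the quadratic term of (4.12) come from symmetrizing the sum over $l + m = k$ in (4.7). A short check, using only the antisymmetry of the Poisson bracket and $\sigma_k = \mathrm{sgn}(k)$ (all brackets are in the variables $(x,y)$):

```latex
\begin{aligned}
\sum_{l+m=k}\sigma_m\{H^l,H^m\}
 &= \tfrac12\sum_{l+m=k}\big(\sigma_m\{H^l,H^m\}+\sigma_l\{H^m,H^l\}\big)
  = \tfrac12\sum_{l+m=k}(\sigma_m-\sigma_l)\{H^l,H^m\}\\
 &= \sum_{\substack{l+m=k\\ m<0<l}}(\sigma_m-\sigma_l)\{H^l,H^m\}
  = 2\sum_{\substack{l+m=k\\ m<0<l}}\sigma_m\{H^l,H^m\}.
\end{aligned}
% Pairs (l,m) of equal sign cancel; the summand is symmetric under l <-> m,
% so the pairs with l<0<m contribute the same as those with m<0<l.
% Substituting H^k = e^{-|k|\delta} u^k then produces the weight
% e^{-(|l|+|m|-|k|)\delta} in front of \{u^l,u^m\}.
```

Since $|l| + |m| - |k| > 0$ whenever $l$ and $m$ have opposite signs, every surviving quadratic term carries an exponentially decaying weight in $\delta$.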

Then, using the variation of parameters method for ODEs, we can write the exact
solution to equation (4.12) in the following form:

(4.13)
$u^k(x,y,\delta) = g^{\varepsilon\sigma_k i\delta}{*}u^k(x,y,0) + 2i\varepsilon\int_0^\delta \sum_{l+m=k,\ m<0<l}\sigma_m\,e^{-(|l|+|m|-|k|)s}\,g^{\varepsilon\sigma_k i(\delta-s)}{*}\{u^l,u^m\}_{(x,y)}\,ds$
$= g^{\varepsilon\sigma_k i\delta}{*}u^k(x,y,0) + 2i\varepsilon\int_0^\delta \sum_{l+m=k,\ m<0<l}\sigma_m\,e^{-(|l|+|m|-|k|)s}\,\{g^{\varepsilon\sigma_k i(\delta-s)}{*}u^l,\ g^{\varepsilon\sigma_k i(\delta-s)}{*}u^m\}_{(x,y)}\,ds.$

We will analyze this equation to study its solution. To do so, we need good control
of the nonhomogeneous term, i.e. the second term on the RHS.



4.4. Control of the nonhomogeneous term of equation (4.13). To control
the nonhomogeneous term, we use the majorant estimate. The majorant relation
"$\ll$" is defined as follows.

Definition 4.1. For any two functions $f(z)$, $g(z)$, $z = (z_1, z_2, \dots, z_n)$, analytic at
the point $z = 0$, with Taylor expansions $f = \sum_\beta f_\beta z^\beta$ and $g = \sum_\beta g_\beta z^\beta$,
we say that $g$ is a majorant for $f$ ($f \ll g$) if for any multi-index $\beta$ we have
$g_\beta \ge |f_\beta|$.

The proof first guesses a majorant assumption, then shows that the function in the
assumption satisfies an equation that majorates the integral equation (4.13). This
verifies the assumption and closes the argument.
Now we make a majorant assumption

(4.14)    $g^{\varepsilon\sigma_k i\tau}{*}u^k(x,y,\delta) \ll \mu\,e^{-|k|\rho}\,V(Y,\delta), \qquad Y = x+y, \quad |\tau| \le \delta^*,$

where $\delta^* \sim \rho/\varepsilon$ is the maximal extension time determined by the homogeneous part
of equation (4.7) in the heuristic argument. The factor $e^{-|k|\rho}$ characterizes how the
Fourier coefficients decay in the case of an analytic perturbation, and $\mu = \|H_1\|_\rho$. We
choose $Y = x+y$ to make it easier to calculate the derivatives, since $\frac{\partial V}{\partial x} = \frac{\partial V}{\partial y} = \frac{\partial V}{\partial Y}$.

Then the integrand of equation (4.13) can be majorated by $C(s)(V_Y)^2$, where
C(s) = 4e/x 



£ 

Z+m=fc, 
m<0<l 



\+\m\~\k\)(s+p) < + ._ c 



This $C$ depends on the smoothness and magnitude of $H_1$ and on the number of
combinations $l + m = k$. The number of combinations of integers in the one dimensional case
is easy to estimate, but in the higher dimensional case it becomes very difficult; this
is the main difficulty that we need to overcome in this paper.

If we can solve the equation $V_\delta = C(V_Y)^2$, then equation (4.13) can be viewed as

(4.15)    $\frac{1}{\mu\,e^{-|k|\rho}}\,g^{\varepsilon\sigma_k i\tau}{*}u^k \ll V(0) + \int_0^\delta V_s\,ds = V(0) + V(\delta) - V(0) = V(\delta).$

This verifies the majorant assumption (4.14). In order to solve the equation $V_\delta = C(V_Y)^2$,
we apply the operator $\frac{\partial}{\partial Y}$ to the equation. Setting $U = V_Y$, we get the following Burgers
equation:

$U_\delta = 2C\,U\,U_Y, \qquad U(Y,0) = \frac{\sigma}{\sigma - Y}.$

Here $U(Y,0)$ majorates $\nabla u$ in the sense $\nabla u^k(x,y,0) \ll \mu\,e^{-|k|\rho}\,U(Y,0)$. The initial
condition $\frac{\sigma}{\sigma-Y}$ is due to Lemma B.1 in Appendix B, where $\sigma$ is the width of
analyticity in the slow variables $(x,y)$.



4.5. Outcome of the continuous averaging procedure. The Burgers equation
can be solved explicitly using the method of characteristics. The solution is

(4.16)    $U(Y,\delta) = \frac{2\sigma}{(\sigma - Y) + \sqrt{(\sigma-Y)^2 - 8\sigma C\delta}}.$

In order to ensure $(\sigma-Y)^2 - 8\sigma C\delta \ge 0$, the maximal flow time allowed by
the slow variables is $\delta \le \frac{(\sigma-Y)^2}{8\sigma C}$. In fact $C = O(\varepsilon)$, so combined with $\delta \le \delta^* \sim \rho/\varepsilon$ we
find that the maximal flow time is $O(1/\varepsilon)$. We also notice that $U$ is always bounded provided
$|Y|$ is sufficiently small, and so is $V$. Recall that we defined $H^k = e^{-|k|\delta}u^k$.
Therefore each Fourier coefficient $H^k$ after the continuous averaging would be less
than $e^{-\delta} = O(e^{-c/\varepsilon})$ for some constant $c$. Adding up all these Fourier terms, we
recover the Hamiltonian after the averaging, which is of order $O(e^{-c/\varepsilon})$. This is the
result proved in [Tr1], [Tr2], [TZ]. We will work out all the details in Section 5.5.
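
The explicit Burgers solution of this subsection, $U(Y,\delta) = 2\sigma / \big((\sigma - Y) + \sqrt{(\sigma-Y)^2 - 8\sigma C\delta}\big)$, can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: $C$ is treated as a constant, the function names and sample parameter values are hypothetical, and the derivatives are approximated by central finite differences.

```python
import math

def burgers_U(Y, delta, sigma, C):
    """Closed-form solution of U_delta = 2*C*U*U_Y with U(Y, 0) = sigma/(sigma - Y)."""
    disc = (sigma - Y) ** 2 - 8.0 * sigma * C * delta
    if disc < 0:
        raise ValueError("delta exceeds the maximal flow time (sigma - Y)^2 / (8 sigma C)")
    return 2.0 * sigma / ((sigma - Y) + math.sqrt(disc))

def pde_residual(Y, delta, sigma, C, h=1e-5):
    """Finite-difference estimate of U_delta - 2*C*U*U_Y at the point (Y, delta)."""
    U = burgers_U(Y, delta, sigma, C)
    U_delta = (burgers_U(Y, delta + h, sigma, C) - burgers_U(Y, delta - h, sigma, C)) / (2 * h)
    U_Y = (burgers_U(Y + h, delta, sigma, C) - burgers_U(Y - h, delta, sigma, C)) / (2 * h)
    return U_delta - 2.0 * C * U * U_Y

# Hypothetical sample values: sigma = 1, C = 0.01 (so C is "of order epsilon").
sigma, C = 1.0, 0.01
print(burgers_U(0.1, 0.0, sigma, C), sigma / (sigma - 0.1))  # initial condition
print(pde_residual(0.1, 2.0, sigma, C))                       # should be close to zero
```

The residual is small but not exactly zero because of the finite-difference approximation; the `ValueError` branch corresponds to leaving the region where the square root is real, which is the source of the bound on the flow time.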
5. Continuous averaging proof of the normal form Theorem 3.1

Now we prove Theorem 3.1 using the continuous averaging method. Let us go back
to the setup in Section 3. Since we are looking at a motion that is very close to
a periodic orbit in this region of the phase space, the continuous averaging explained
in the previous section can be applied. The periodic orbit corresponds to the fast
angle in equation (4.5). The nonresonant part $\tilde H$ corresponds to the $\theta$-dependent
term $\sum_{k\ne 0} H^k e^{ik\theta}$ in equation (4.5). The term $\langle\omega^*, I\rangle$ will produce the exponential decay
in the same way as the term $I$ in equation (4.5) did in equation (4.8), and the
term $G(I)$ will generate the imaginary flow in the same way as the term $\langle H_1\rangle$
in equation (4.5) did in equation (4.8). Finally, the term $\bar H$ leads to additional
difficulties.



We devote the remaining part of this section to the proof of Theorem 3.1. The proof
is organized as follows.

• Set up the continuous averaging in terms of the Hamiltonian and get some heuristic
understanding of the averaging process in Section 5.1.

• Apply it to the Hamiltonian vector field in Section 5.2.

• Following the procedure in Section 4, define the operator $g$ to write the
differential equations as integral equations, then write down the majorant
equation and prove the majorant relations.

• Derive the necessary estimates in the theorem from the majorant estimates in
Section 5.5.



5.1. Continuous averaging for Hamiltonian (3.2). In this section, we write
down the continuous averaging and get a heuristic understanding. We start with a
definition. As we have seen in Section 4, the process of continuous averaging
involves different aspects such as exponential decay, imaginary flow and nonhomogeneous
terms.

Definition 5.1. We define a partition of the width of analyticity $\rho$,

$\rho = \rho_1 + 2\rho_2 + \rho_3, \qquad \rho_1, \rho_2, \rho_3 > 0,$

and put

$K = \frac{\rho_1}{\rho_3 M_+ \mathcal{R}\,\tilde T}.$

For $\delta \ge 0$, we also define the following sets to form a partition of the grid $\mathbb{Z}^n$:

$D_-(\delta) = \{l_- \in \mathbb{Z}^n \mid \langle l_-,\omega^*\rangle < 0,\ |l_-|\rho_3 + |\langle l_-,\omega^*\rangle|\delta < \rho_3 K\},$
$D_+(\delta) = \{l_+ \in \mathbb{Z}^n \mid \langle l_+,\omega^*\rangle > 0,\ |l_+|\rho_3 + |\langle l_+,\omega^*\rangle|\delta < \rho_3 K\},$
$D_0 = \{l_0 \in \mathbb{Z}^n \mid \langle l_0,\omega^*\rangle = 0\},$
$D_>(\delta) = \mathbb{Z}^n \setminus (D_-(\delta)\cup D_+(\delta)\cup D_0).$

Figure 1. Definition of sets when $\delta = 0$. The diamond encloses
integer vectors $|k| < K$. $\omega$ is the frequency vector $\omega^*$. The red line
is a hyperplane perpendicular to $\omega$. The region outside the diamond
is $D_>(\delta)$. The red line splits the region inside the diamond into
$D_+(\delta)$ and $D_-(\delta)$.


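Finally, we define two functions of $\delta$ associated to the above sets:

$\sigma_k(\delta) = \begin{cases} 1, & k \in D_+(\delta),\\ -1, & k \in D_-(\delta),\\ 0, & k \in D_0 \cup D_>(\delta), \end{cases} \qquad S_k(\delta) = \int_0^\delta \sigma_k(s)\,ds.$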



Remark 5.1. • We split the analyticity width $\rho$ of the fast angle $\theta$ into
$\rho = \rho_1 + 2\rho_2 + \rho_3$. This splitting is quite flexible. We will optimize it to make $\rho_1$
as large as possible in Sections 6 and 7. Here $\rho_1$ is used to control the
imaginary flow, $\rho_3$ is used to do the averaging, and $\rho_2$ is the remaining width of
analyticity in the angular variables after averaging. These distinctions will be
made clear in the course of the proof.
• We choose the cut-off $K$ to make sure that if $|k| > K$, then the corresponding
Fourier coefficient is smaller than $e^{-\rho_3 K}$, which we consider sufficiently
small. A Fourier coefficient with $k \in D_\pm(\delta)$ becomes smaller as $\delta$
increases. Once it is smaller than $e^{-\rho_3 K}$, the vector $k$ enters $D_>(\delta)$. So $D_\pm(\delta)$
keeps shrinking as $\delta$ increases. We stop running the continuous averaging
once $D_\pm = \emptyset$.



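We also define

$\delta^* := \sup\{\delta \mid D_\pm(\delta) \ne \emptyset\}$

as the stopping time.

Lemma 5.1. The stopping time $\delta^*$ satisfies

$\delta^* \le K\tilde T\rho_3.$

Proof. From the definition of $D_\pm(\delta)$, we know that at time $\delta^*$ we should have

$|l_+|\rho_3 + |\langle l_+,\omega^*\rangle|\,\delta^* = \rho_3 K.$

We know $|\langle l_+,\omega^*\rangle| \ge 1/\tilde T$ from equation (3.1). This implies $\delta^* \le K\tilde T\rho_3$. □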
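
The bound of Lemma 5.1 can be illustrated by brute force on a small example. In the sketch below a mode $l$ is taken to belong to $D_\pm(\delta)$ while $|l|\rho_3 + |\langle l,\omega^*\rangle|\delta < \rho_3 K$, with $|l|$ the $\ell^1$ norm and $\omega^* = p/\tilde T$; the function name and the sample values of $p$, $\tilde T$, $\rho_3$, $K$ are hypothetical and chosen only for illustration.

```python
from itertools import product

def stopping_time(p, T_tilde, rho3, K):
    """Largest delta at which some nonresonant mode l is still in D_+ or D_-.

    Mode l leaves D_± when |l|_1*rho3 + |<l, omega*>|*delta reaches rho3*K,
    i.e. at delta_l = rho3*(K - |l|_1) / |<l, omega*>|, with omega* = p/T_tilde.
    """
    box = int(K)  # |l|_1 < K forces every |l_i| <= int(K)
    best = 0.0
    for l in product(range(-box, box + 1), repeat=len(p)):
        norm1 = sum(abs(c) for c in l)
        res = abs(sum(c * q for c, q in zip(l, p))) / T_tilde
        if res == 0 or norm1 >= K:
            continue  # resonant modes (D_0) and modes already in D_>
        best = max(best, rho3 * (K - norm1) / res)
    return best

# Hypothetical data: omega* = (1, 2)/3, rho_3 = 0.1, K = 5.
p, T_tilde, rho3, K = (1, 2), 3.0, 0.1, 5.0
delta_star = stopping_time(p, T_tilde, rho3, K)
print(delta_star, K * T_tilde * rho3)  # stopping time vs. the bound of Lemma 5.1
```

In this example the last modes to leave are $l = (\pm 1, 0)$, which have the smallest $\ell^1$ norm together with the smallest nonzero $|\langle l,\omega^*\rangle| = 1/\tilde T$; this is exactly the situation that the proof of the lemma bounds.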



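Now let us build our continuous averaging. This part is analogous to Section 4.1.
From equation (4.3), we have

(5.1)    $H_\delta = \{F,H\} = \{F,\langle\omega^*,I\rangle\} + \{F,G\} + \{F,\varepsilon\bar H\} + \{F,\varepsilon\tilde H\}.$

Lemma 5.2. If we define

(5.2)    $F := \sum_k i\varepsilon\sigma_k(\delta)\,H^k(I,x,y,\delta)\,e^{i\langle k,\theta\rangle}$

in the continuous averaging equation (5.1), where $\sigma_k(\delta)$ is defined in Definition 5.1,
then depending on the properties of the Fourier mode $k$, we have the following three
groups of PDEs.

(5.3a) For $k \in D_0$,

$H^k_\delta e^{i\langle k,\theta\rangle} = \{F,\varepsilon\tilde H\}^k = \sum_{l_\pm+l=k} i\sigma_{l_\pm}\{H^{l_\pm}e^{i\langle l_\pm,\theta\rangle},\ \varepsilon H^l e^{i\langle l,\theta\rangle}\}, \qquad l \in \mathbb{Z}^n\setminus D_0.$

(5.3b) For $k \in D_-(\delta)\cup D_+(\delta)$,

$H^k_\delta e^{i\langle k,\theta\rangle} = -|\langle\omega^*,k\rangle|H^k e^{i\langle k,\theta\rangle} + i\sigma_k\{H^k e^{i\langle k,\theta\rangle}, G\} + \{F,\varepsilon\bar H\}^k + \{F,\varepsilon\tilde H\}^k$
$= -|\langle\omega^*,k\rangle|H^k e^{i\langle k,\theta\rangle} + i\sigma_k\{H^k e^{i\langle k,\theta\rangle}, G\} + \sum_{l_\pm+l=k} i\sigma_{l_\pm}\{H^{l_\pm}e^{i\langle l_\pm,\theta\rangle},\ \varepsilon H^l e^{i\langle l,\theta\rangle}\}.$

(5.3c) For $k \in D_>(\delta)$,

$H^k_\delta e^{i\langle k,\theta\rangle} = \{F,\varepsilon\bar H\}^k + \{F,\varepsilon\tilde H\}^k = i\sum_{l_\pm+l=k}\varepsilon\sigma_{l_\pm}\{H^{l_\pm}e^{i\langle l_\pm,\theta\rangle},\ H^l e^{i\langle l,\theta\rangle}\}.$

We have $l \ne 0$ in the latter two cases.

Proof. From the definition of $F$ we know that the Fourier harmonics of $F$ come only from
$D_\pm(\delta)$. As a result, for any $k \ne 0$, we must write $k = l_\pm + l$ for $l_\pm \in D_\pm(\delta)$ and
some $l$. The equation (5.3b) is straightforward.

If $k \in D_0$, then $\langle k,\omega^*\rangle = 0$. We know $\langle l_\pm,\omega^*\rangle \ne 0$, so $\langle l,\omega^*\rangle \ne 0$. So in
equation (5.3a), the $\{F,\varepsilon\bar H\}$ terms do not appear. If $k \in D_>(\delta)$, no Fourier harmonics
from $\{F,\langle I,\omega^*\rangle\} + \{F,G\}$ in equation (5.1) enter equation (5.3c). □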



Following directly from Definition 1.2, we have the following lemma.

Lemma 5.3. If we define $\mathcal{R}$ as the confinement radius of $I$, i.e. $|I|_2 \le \mathcal{R}$,
$I \in G^n + \sigma \subset \mathbb{C}^n$, then we have the following estimates:

$|G| \le M_+\mathcal{R}^2/2, \qquad |\nabla G|_2 \le M_+\mathcal{R}.$

Proof. We first notice $G(0) = \nabla G(0) = 0$. For $|G|$, we use the formula

$G(I) = \Big\langle \int_0^1 (1-t)\,\nabla^2 H_0(tI)\,I\,dt,\ I \Big\rangle$

and Definition 1.2 to get the estimate in the lemma. For $|\nabla G(I)|_2$, we use

$\nabla G(I) = \int_0^1 \nabla^2 G(tI)\,I\,dt = \int_0^1 \nabla^2 H_0(tI)\,I\,dt.$ □

The following lemma helps us to understand the heuristic ideas behind the process of
continuous averaging and Definition 5.1.

Lemma 5.4. If we omit the $\sum_{k=l_\pm+l}$ terms on the RHS of equations (5.3), then
equations (5.3) can be solved explicitly and the solution satisfies

$|H^k(I,x,y,\delta)| \le \mu\,e^{-\rho|k|}$, for $k \in D_>(0)\cup D_0$,
$|H^k(I,x,y,\delta)| \le \mu\,e^{-2\rho_2|k|}\,e^{-\rho_3|k| - \delta|\langle\omega^*,k\rangle|}$, for $k \in D_\pm(\delta)$.

Moreover, at the stopping time $\delta^*$ we have

$|H^k(I,x,y,\delta^*)| \le \mu\,e^{-2\rho_2|k|}\,e^{-\rho_3 K}$, for $k \in D_\pm(0)$,

where the domain of variables is $(I,x,y) \in (G^n+\sigma)\times(W^{2m}+\sigma)$.

Proof. It follows from Definition 1.1 that $|H^k(I,x,y,0)| \le \mu\,e^{-\rho|k|}$ for
$(I,x,y) \in (G^n+\sigma)\times(W^{2m}+\sigma)$.

If we truncate equations (5.3), then the first and the third become $H^k_\delta = 0$ for
$k \in D_>(0)\cup D_0$. So we have the corresponding estimate of $|H^k|$ stated in the
lemma. However, equation (5.3b) becomes

$H^k_\delta e^{i\langle k,\theta\rangle} = -|\langle\omega^*,k\rangle|H^k e^{i\langle k,\theta\rangle} + i\sigma_k\{H^k e^{i\langle k,\theta\rangle}, G\} = \big(-|\langle\omega^*,k\rangle| - \sigma_k\langle k,\nabla G\rangle\big)H^k e^{i\langle k,\theta\rangle}.$

This equation admits the explicit solution

$H^k(I,x,y,\delta) = e^{-|\langle\omega^*,k\rangle|\delta - \sigma_k\langle k,\nabla G\rangle\delta}\,H^k(I,x,y,0).$

So we have the estimate

$|H^k| \le \mu\,e^{-\rho|k|}\,e^{-|\langle\omega^*,k\rangle|\delta - \sigma_k\langle k,\nabla G\rangle\delta}, \qquad k \in D_\pm(\delta).$

Using Lemma 5.3, we get

$|\sigma_k\langle k,\nabla G\rangle| \le |\nabla G|_\infty\cdot|k| \le |\nabla G|_2\cdot|k| \le M_+\mathcal{R}|k|.$

In the splitting $\rho = \rho_1 + 2\rho_2 + \rho_3$, we use $\rho_1$ to bound the term $\langle k,\nabla G\rangle$. Namely,
we need

$|\sigma_k\langle k,\nabla G\rangle|\,\delta \le \rho_1|k|.$

It is enough to require that

(5.4)    $\delta M_+\mathcal{R} \le \rho_1.$

This also gives an upper bound for $\delta$. We equate this upper bound with the one
given in Lemma 5.1 to obtain the value of $K$ in Definition 5.1. Now we have

$|H^k| \le \mu\,e^{-2\rho_2|k|}\,e^{-\rho_3|k| - |\langle\omega^*,k\rangle|\delta}, \qquad k \in D_\pm(\delta).$

The definition of $D_\pm(\delta)$ implies that once this $H^k$ term is already of size $e^{-2\rho_2|k|}e^{-\rho_3 K}$, the
vector $k$ enters $D_>(\delta)$ and no longer belongs to $D_\pm(\delta)$. □
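
The value of $K$ and the resulting exponentially small factor can be made explicit by equating the two upper bounds on $\delta$. The short computation below only uses (5.4), Lemma 5.1 and the relation between the period and the denominator of $\omega^*$; here $\tilde T$ denotes the denominator of $\omega^*$ in Definition 3.1, $T = 2\pi\tilde T$ the period, and $\mathcal{R}$ the confinement radius of Lemma 5.3 (notation assumed for this sketch).

```latex
\begin{aligned}
&\delta \le \frac{\rho_1}{M_+\mathcal{R}} \quad\text{(from (5.4))},
\qquad
\delta^* \le K\tilde T\rho_3 \quad\text{(Lemma 5.1)}\\
&\Longrightarrow\quad
K\tilde T\rho_3 = \frac{\rho_1}{M_+\mathcal{R}},
\qquad
K = \frac{\rho_1}{\rho_3 M_+\mathcal{R}\,\tilde T},\\
&e^{-\rho_3 K}
  = \exp\!\Big(-\frac{\rho_1}{M_+\mathcal{R}\,\tilde T}\Big)
  = \exp\!\Big(-\frac{2\pi\rho_1}{M_+\mathcal{R}\,T}\Big).
\end{aligned}
```

In this form the role of $\rho_1$ is visible: the larger the part of the analyticity width reserved for the imaginary flow, the smaller the factor $e^{-\rho_3 K}$ reached at the stopping time.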
5.2. Continuous averaging for a vector field. In order for the majorant estimates
to be applicable to equations (5.3), we need to write the continuous
averaging equations in terms of the Hamiltonian vector field.

Definition 5.2. We introduce the following vector fields $h_*$, $h_0$, $\bar h$, $\tilde h$ corresponding
to the different parts $\langle I,\omega^*\rangle$, $G$, $\bar H$, $\tilde H$ of the Hamiltonian (3.2), and $f$ corresponding
to $F$:

(5.5)    $h_* = (0,\omega^*,0,0), \quad h_0 = (0,\nabla G,0,0), \quad \bar h = J\nabla\bar H, \quad \tilde h = J\nabla\tilde H, \quad f = J\nabla F.$

We also use $h^k$ to denote the $k$-th Fourier coefficients of $\bar h$ and $\tilde h$. Moreover,
corresponding to $F$ in equation (5.2), we define

$f = \sum_k i\varepsilon\sigma_k(\delta)\,h^k e^{i\langle k,\theta\rangle}.$

With this definition, we can rewrite the continuous averaging equation (5.1) as
follows, replacing the Poisson bracket by the Lie bracket and the upper case letters
$H$, $F$ by the lower case letters $h$, $f$ respectively:

$h_\delta = [f,\ \omega^* + h_0(I) + \varepsilon\bar h + \varepsilon\tilde h].$

Lemma 5.5. If we set $v^k = h^k e^{S_k(\delta)\langle\omega^*,k\rangle}$ (recall that $S_k(\delta)$ was defined in Definition 5.1),
then equations (5.3) can be rewritten in the following form in terms of the
Hamiltonian vector field.
(5.6a) For k £ D , 

v k = e~^he £ a l± 
l±+l=k 

(5.6b) For k G D_ (S) U D + (5) , 



-((uj*,l±)S l± + (uj*,l)Si) 



v k e^ e \v 



l ± +l=k 

(5.6c) ForkeD>(5), 

l ± +l=k 



v l± e i«±,0) J e id,6 



-((w*,l±)Si,+(u*,l)S l -(aj*,k)S k ) 



v k s = ie~^ 



v i± e i(i±fi) i e i(i,< 



-((u)*,l±)S l± +(u*,l)Si-("*,k)Sk) 



Proof. In equations (5.3), we replace the Poisson bracket by the Lie bracket and the
upper case letters $H$, $F$ by the lower case letters $h$, $f$ respectively. Then we remove the
$-|\langle\omega^*,k\rangle|h^k$ term in the second case as we did in Section 4.3, by setting
$v^k = h^k e^{S_k(\delta)\langle\omega^*,k\rangle}$. Then direct computation proves the lemma. □
5.3. The operator g and the majorant commutator. What we do next is to
write the differential equations for $v^k_\delta$ as integral equations. As we did in Section 4.3,
we first need to define an operator $g$ which solves the homogeneous part of
equations (5.6), especially (5.6b).

5.3.1. The operator g.

Definition 5.3 (Section 2 of [PT]). Let $g^t$ be the Hamiltonian flow of the Hamiltonian
vector field $h_0(I)$ generated by the Hamiltonian $G(I)$. We put $f^k = f(I,x,y)\,e^{i\langle k,\theta\rangle}$
for an arbitrary analytic function $f$ defined on $\mathcal{D}(\rho,\sigma)$ and then define:

Elf = e- i ^g- it (f k og it ), t € R. 



It is shown in Sections 5 and 7 of [PT] that $g$ has the following two properties.



(1,5) = g S k k{5) v k (I,0) solves t#(I,$) = ia k (5)e- l ^[v k e 



i(k,9) r k i(k,6 



gU 2 (e-^ k ^nhe^ 9 \ he^ 



-i(k 1 +k2,e)ri(k 1 ,6) t 1 i(k 2 ,e) t 



With this operator g, we can write differential equations (5.6) as integral equations.

Lemma 5.6. If we denote the $\sum_{k = l_\pm + l}$ terms in equations (5.6a), (5.6b), (5.6c) by $\eta_a$, $\eta_b$, $\eta_c$ respectively, then we have the following three integral equations equivalent to equations (5.6).

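(5.7a) For $k \in D_0$, $v^k(I,\delta) = v^k(I,0) + \varepsilon\int_0^\delta \big(e^{-i\langle k,\theta\rangle}\eta^k_a\big)\,ds$.

(5.7b) For $k \in D_- \cup D_+$, $v^k(I,\delta) = g^{S_k(\delta)}_k v^k(I,0) + \varepsilon\int_0^\delta g^{\sigma_k(\delta - s)}_k\big(e^{-i\langle k,\theta\rangle}\eta^k_b\big)\,ds$.

(5.7c) For $k \in D_>(\delta)$, $v^k(I,\delta) = v^k(I,0) + \varepsilon\int_0^\delta \big(e^{-i\langle k,\theta\rangle}\eta^k_c\big)\,ds$.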



Proof. The equations (5.7a) and (5.7c) are straightforward. The equation (5.7b) is an application of the first property of the operator g above and the variation of parameters method for ODEs. □
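The variation of parameters step can be illustrated on a scalar toy analogue of (5.7b); this is only a sketch, not the vector-field equation itself. For $v' = i c\,v + \varepsilon\,\eta(s)$ the homogeneous flow $e^{ic\delta}$ plays the role of the operator g. The names `duhamel`, `rk4`, `sample_eta` and all numerical values are hypothetical.

```python
import cmath

def sample_eta(s):
    """Hypothetical inhomogeneous term, chosen only for illustration."""
    return cmath.exp(-s)

def duhamel(v0, c, eps, eta, delta, steps=2000):
    """v(delta) = e^{i c delta} v0 + eps * int_0^delta e^{i c (delta - s)} eta(s) ds.

    The integral is approximated with the midpoint rule.
    """
    h = delta / steps
    integral = 0j
    for j in range(steps):
        s = (j + 0.5) * h
        integral += cmath.exp(1j * c * (delta - s)) * eta(s) * h
    return cmath.exp(1j * c * delta) * v0 + eps * integral

def rk4(v0, c, eps, eta, delta, steps=2000):
    """Integrate v' = i c v + eps * eta(s) directly with the classical RK4 scheme."""
    def f(s, v):
        return 1j * c * v + eps * eta(s)
    h = delta / steps
    v, s = complex(v0), 0.0
    for _ in range(steps):
        k1 = f(s, v)
        k2 = f(s + h / 2, v + h / 2 * k1)
        k3 = f(s + h / 2, v + h / 2 * k2)
        k4 = f(s + h, v + h * k3)
        v += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        s += h
    return v
```

Both routines should agree up to discretization error, which is the content of the variation of parameters formula in this scalar setting.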



5.3.2. The majorant commutator. We need the following majorant commutator to 
perform estimates. 
Definition 5.4 (Section 7 of [PT]). For any two functions $F, G : \mathbb{C}^{n+2m} \to \mathbb{C}$, and any two vectors $l, k \in \mathbb{Z}^n$, we define the majorant commutator:

$[[F, G]]_{l,k} = (|l| + |k|)FG + (n + 2m)\frac{\partial}{\partial Y}(FG)$,

where $Y = I + x + y$.
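To make the definition concrete, here is a minimal sketch that evaluates the majorant commutator when F and G are polynomials in the single variable Y, stored as coefficient lists (lowest degree first). The representation and the helper names `poly_mul`, `poly_diff`, `majorant_commutator` are assumptions made for illustration only.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative with respect to Y; the derivative of a constant is [0.0]."""
    return [i * a for i, a in enumerate(p)][1:] or [0.0]

def majorant_commutator(F, G, l_norm, k_norm, n, m):
    """[[F, G]]_{l,k} = (|l| + |k|) F G + (n + 2m) d/dY (F G)."""
    FG = poly_mul(F, G)
    dFG = poly_diff(FG)
    out = [(l_norm + k_norm) * a for a in FG]
    for i, b in enumerate(dFG):
        out[i] += (n + 2 * m) * b
    return out
```

For example, with F = 1 + Y, G = 1 + 2Y, |l| = 1, |k| = 2, n = 2, m = 1, the product is 1 + 3Y + 2Y^2 and the commutator is 3(1 + 3Y + 2Y^2) + 4(3 + 4Y).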

For this commutator, we have the following lemmas. 

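Lemma 5.7 (Proposition 7.1 of [PT]). Suppose that $f_1, f_2 : \mathbb{C}^{n+2m} \to \mathbb{C}^{2(n+m)}$; $F_1, F_2 : \mathbb{C}^{n+2m} \to \mathbb{C}$, and $f_1 \ll F_1$, $f_2 \ll F_2$. Here the majorant relation $f_i \ll F_i$ means that $F_i$ majorates each component of the vector $f_i$, $i = 1, 2$. Then for any $k_1, k_2 \in \mathbb{Z}^n$,

$e^{-i\langle k_1 + k_2,\theta\rangle}\big[f_1 e^{i\langle k_1,\theta\rangle},\ f_2 e^{i\langle k_2,\theta\rangle}\big] \ll [[F_1, F_2]]_{k_1,k_2}$.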



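Lemma 5.8 (Proposition 7.2 of [PT]). Suppose that $g^\tau_{k_1} f_1(Y) \ll F_1(Y)$, $g^\tau_{k_2} f_2(Y) \ll F_2(Y)$. Then

$g^\tau_{k_1 + k_2}\Big(e^{-i\langle k_1 + k_2,\theta\rangle}\big[f_1 e^{i\langle k_1,\theta\rangle},\ f_2 e^{i\langle k_2,\theta\rangle}\big]\Big) \ll [[F_1, F_2]]_{k_1,k_2}$.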



5.4. Majorant equation, the derivation and the solution. 

5.4.1. Majorant control on the initial value. We first have majorant control on the 
initial value. 

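Lemma 5.9. For $|\delta| \le \delta^*$ and $\mathcal{R} < \sigma$, $k \in \mathbb{Z}^n$, we have

$v^k(I,x,y,0) \ll \frac{\sigma\mu e^{-\rho|k|}}{\sigma - Y}$, $\quad g^\delta_k v^k(I,x,y,0) \ll \frac{\sigma\mu e^{-(\rho_3 + 2\rho_2)|k|}}{\sigma - Y}$.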



Proof. We first consider $v^k(I,x,y,0) = h^k$. We know $|h^k|_\infty \le \mu e^{-\rho|k|}$ for $(I,x,y) \in (\mathcal{G}^n + \sigma) \times (\mathcal{W}^{2m} + \sigma)$ from the definition of $\mu$ in Definition 1.1. Then we use Lemma B.1 (4) in Appendix B to obtain the majorant control of $v^k(I,x,y,0)$.



Now we consider the effect of g. The operator g is defined by the Hamiltonian flow generated by the Hamiltonian $iG(I)$ in Definition 5.3. The variables I, x, y are constants of motion of this Hamiltonian flow. So g only shrinks the width of analyticity in $\theta$ but has no influence on that of I, x, y. From the definition of g, we see

$|g^\delta_k v^k(I,x,y,0)| \le e^{|\langle k,\nabla G\rangle|\delta}\,|v^k(I,x,y,0)|$.



We also have 



\{k,VG)\5 < M+U5*\k\ < Pl \k\ 
according to inequality (5.4). This tells us

$|g^\delta_k v^k(I,x,y,0)|_\infty \le \mu e^{-(\rho_3 + 2\rho_2)|k|}$ for $(I,x,y) \in (\mathcal{G}^n + \sigma) \times (\mathcal{W}^{2m} + \sigma)$.

Now use Lemma B.1 (4) in Appendix B again to obtain the lemma.

□ 

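5.4.2. Majorant equations. The following construction is given in [PT].

Definition 5.5. Consider a continuous function $a(\delta)$. We define the functions $W$ and $W^{|k|}$ as solutions of the following PDEs.

$W_\delta = a(\delta)\,W W_Y$, $\quad W|_{\delta=0} = \frac{1}{\sigma - Y}$,

(5.8) $W^{|k|}(\delta) = W(\delta)$, for $|k| \le K$,

$W^{|k|}_\delta = a(\delta)\Big(W W^{|k|}_Y + \frac{|k|}{K} W W^{|k|}\Big)$, $\quad W^{|k|}|_{\delta=0} = \frac{1}{\sigma - Y}$, for $|k| > K$.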



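Lemma 5.10. The solutions $W$ and $W^{|k|}$ are given explicitly by

$W = \frac{2}{(\sigma - Y) + \sqrt{(\sigma - Y)^2 - 4A(\delta)}}$,

(5.9) $W^{|k|} = W e^{W A(\delta)|k|/K}$, for $|k| > K$,

$A(\delta) = \int_0^\delta a(s)\,ds$.

The solutions are defined up to time $\delta^*$ and for Y satisfying the restrictions

(5.10) $A(\delta^*) \le \big(\sigma - (\sqrt{n} + 2\sqrt{m})\mathcal{R}\big)^2/4$, $\quad |Y| \le (\sqrt{n} + 2\sqrt{m})\mathcal{R}$.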



Proof. The fact that $W$ and $W^{|k|}$ are exact solutions can be checked directly. To obtain the restriction for $\delta^*$, we need to ensure $(\sigma - Y)^2 - 4A(\delta) \ge 0$ so that the square root makes sense.

We want that when $\delta = \delta^*$, we still have $|I|_2, |x|_2, |y|_2 \le \mathcal{R}$. We know

$|Y| \le |I| + |x| + |y| \le \sqrt{n}|I|_2 + \sqrt{m}|x|_2 + \sqrt{m}|y|_2 \le (\sqrt{n} + 2\sqrt{m})\mathcal{R}$.

□
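The direct check mentioned in the proof can also be done numerically. The sketch below assumes the explicit formulas $W = 2/((\sigma - Y) + \sqrt{(\sigma - Y)^2 - 4A})$ and $W^{|k|} = W e^{WA|k|/K}$, picks a hypothetical rate `a(delta)` (not the one used in the paper), and measures by central differences how well $W_\delta = aWW_Y$ and $W^{|k|}_\delta = a(WW^{|k|}_Y + (|k|/K)WW^{|k|})$ hold.

```python
import math

SIGMA = 1.0  # hypothetical width of analyticity

def a(delta):
    """Hypothetical continuous rate a(delta), chosen only for illustration."""
    return 0.05 * (1.0 + delta)

def A(delta):
    """A(delta) = integral of a(s) over [0, delta] for the rate above."""
    return 0.05 * (delta + 0.5 * delta * delta)

def W(Y, delta):
    u = SIGMA - Y
    return 2.0 / (u + math.sqrt(u * u - 4.0 * A(delta)))

def Wk(Y, delta, ratio):
    """W^{|k|} = W * exp(W * A * |k|/K), where ratio stands for |k|/K > 1."""
    w = W(Y, delta)
    return w * math.exp(w * A(delta) * ratio)

def d_dY(f, Y, delta, h=1e-5):
    return (f(Y + h, delta) - f(Y - h, delta)) / (2 * h)

def d_ddelta(f, Y, delta, h=1e-5):
    return (f(Y, delta + h) - f(Y, delta - h)) / (2 * h)

def residuals(Y, delta, ratio=1.5):
    """Return the residuals of the two PDEs at the point (Y, delta)."""
    wk = lambda y, d: Wk(y, d, ratio)
    r1 = d_ddelta(W, Y, delta) - a(delta) * W(Y, delta) * d_dY(W, Y, delta)
    r2 = d_ddelta(wk, Y, delta) - a(delta) * (
        W(Y, delta) * d_dY(wk, Y, delta) + ratio * W(Y, delta) * wk(Y, delta))
    return r1, r2
```

The residuals are at the level of the finite-difference error as long as $(\sigma - Y)^2 - 4A(\delta) > 0$, which is exactly the restriction discussed in the proof.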



Remark 5.2. Let us try to understand the PDEs (5.8) heuristically. Consider

(5.11) $U_t = W U_x + V U$, $\quad U(x, 0) = U_0(x)$.

The way to solve it is the method of characteristics. The characteristics are given by $\frac{dx}{dt} = -W$. Then we are able to write the PDE in the form $dU/dt = VU$. Then $U = U_0 e^{\int V\,dt}$. So we see that W determines how fast we approach the intersection of characteristics, while V determines how U grows.
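A minimal sketch of the characteristic method in the simplest case, assuming constant coefficients W and V (an assumption made only for this illustration; in (5.8) the coefficients depend on the unknown). The solution is then $U(x,t) = U_0(x + Wt)e^{Vt}$, and the names below are hypothetical.

```python
import math

W_SPEED = 0.7   # hypothetical constant coefficient W
V_RATE = -0.3   # hypothetical constant coefficient V

def U0(x):
    """Hypothetical initial datum."""
    return math.sin(x)

def U(x, t):
    """Solution of U_t = W U_x + V U with constant W, V."""
    return U0(x + W_SPEED * t) * math.exp(V_RATE * t)

def pde_residual(x, t, h=1e-5):
    """Central-difference residual of U_t - W U_x - V U."""
    Ut = (U(x, t + h) - U(x, t - h)) / (2 * h)
    Ux = (U(x + h, t) - U(x - h, t)) / (2 * h)
    return Ut - W_SPEED * Ux - V_RATE * U(x, t)

def along_characteristic(x0, t):
    """Compare U on the characteristic x(t) = x0 - W t with U0(x0) e^{V t}."""
    on_curve = U(x0 - W_SPEED * t, t)
    predicted = U0(x0) * math.exp(V_RATE * t)
    return on_curve, predicted
```

Along each characteristic the value of U changes only through the factor $e^{Vt}$, which is the statement $dU/dt = VU$ in the remark.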



5.4.3. Proof of the majorant relation: $W$, $W^{|k|}$ majorate the solutions of equation (5.7). The main result of this section is summarized in the following proposition, which implies the solutions of equations (5.8) majorate that of equation (5.7).

Proposition 5.11. For any $\tau$ such that $|\tau| + \delta \le \delta^*$, we have the following majorant control of the solution $v^k(I,x,y,\delta)$ of equation (5.7):

$g^\tau_k v^k(I,x,y,\delta) \ll \sigma\mu e^{-|k|(\rho_3 + 2\rho_2)} W^{|k|}(Y,\delta)$, for $k \in \mathbb{Z}^n \setminus \{0\}$,

and

$v^k(I,x,y,\delta) \ll \sigma\mu e^{-\rho|k|} W^{|k|}(Y,\delta)$, for $k \in D_0 \cup D_>(0)$,

under the restriction (5.10) coming from Lemma 5.10. (The expression of $a(\delta)$ and $A(\delta)$ will be given explicitly in Lemma 5.13.) Moreover $A(\delta^*)$ is given by

(5.12) $A(\delta^*) = 6e^{\sigma}\sigma\varepsilon\mu T\,\frac{(2K)^{n-1}K^2\rho_3}{n!}\left(1 + \frac{2n}{K\rho_3}\left(1 + \ln\frac{2K^2\rho_3}{n}\right)\right)$.



Proof. We first cite Proposition A.1 in [PT].

Lemma 5.12 ([PT]). Consider the functions $W$, $W^{|k|}$ defined in Definition 5.5; then the following statements are true:

(1) l/(o- -Y) < W <^W Y , 

(2) W < W\ k \, 

(3) WyWW < WWf, 

(4) <C We^ 1 , 

(5) W\ y \ < tyl*l e <KI*'H*l)/^ \k\ < \k'\. 



5.5, then 



Let us first divide equations (5.7a), (5.7b), (5.7c) by the numerators of the initial condition in Lemma 5.9, i.e. $\sigma\mu e^{-\rho|k|}$, $\sigma\mu e^{-(\rho_3 + 2\rho_2)|k|}$ and $\sigma\mu e^{-\rho|k|}$ respectively. Then we use the expression $\sigma\mu\zeta^k$ to refer to any one of $\varepsilon e^{\rho|k|}e^{-i\langle k,\theta\rangle}\eta^k_a$, $\varepsilon e^{|k|(\rho_3 + 2\rho_2)}g^{\delta - s}_k\big(e^{-i\langle k,\theta\rangle}\eta^k_b\big)$, or $\varepsilon e^{\rho|k|}e^{-i\langle k,\theta\rangle}\eta^k_c$ (see the integrands of equations (5.7) for the definitions of $\eta_a$, $\eta_b$, $\eta_c$).

To carry out the proof, we substitute the majorant relation in Proposition 5.11 into equations (5.7) to check that equations (5.7) are majorated by equations (5.8). This is the plan proposed in [PT]. We use the majorant commutator to majorate each of the Lie brackets of $\zeta^k$ according to Lemmas 5.7 and 5.8:

$\zeta^k \ll \varepsilon\sigma\mu \sum_{l_\pm + l = k} e^{-(|l_\pm| + |l| - |k|)(\rho_3 + 2\rho_2)}\, e^{-(S_{l_\pm}\langle\omega^*,l_\pm\rangle + S_l\langle\omega^*,l\rangle - S_k\langle\omega^*,k\rangle)}\, [[W^{|l_\pm|}, W^{|l|}]]_{l_\pm,l}$.

Here we also use that $e^{-\rho(|l_\pm| + |l| - |k|)} \le e^{-(|l_\pm| + |l| - |k|)(\rho_3 + 2\rho_2)}$ for $\eta_a$ and $\eta_c$. For simplicity, we denote the exponential weight by

$E(k, l_\pm, l, \delta) := e^{-(|l_\pm| + |l| - |k|)(\rho_3 + 2\rho_2)}\, e^{-(S_{l_\pm}\langle\omega^*,l_\pm\rangle + S_l\langle\omega^*,l\rangle - S_k\langle\omega^*,k\rangle)}$.



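Applying the definition of the majorant commutator (Definition 5.4), we get

$\zeta^k \ll \varepsilon\sigma\mu \sum_{l_\pm + l = k} E(k, l_\pm, l, \delta)\Big((|l_\pm| + |l|)WW^{|l|} + 2(n + 2m)WW^{|l|}_Y\Big)$.

Here we use Lemma 5.12 (3). This gives $(WW^{|l|})_Y \ll 2WW^{|l|}_Y$. We introduce the notations

(5.13)
$\Sigma_\pm := \sum_{l_+ + l_- = k} E(k, l_+, l_-, \delta)$,
$\Sigma_> := \sum_{l_\pm + l_> = k} E(k, l_\pm, l_>, \delta)$,
$\Sigma_0 := \sum_{l_\pm + l_0 = k} E(k, l_\pm, l_0, \delta)$,

to obtain

$\zeta^k \ll 2\varepsilon\sigma\mu\Sigma_\pm\Big((|l_+| + |l_-|)W^2 + (n + 2m)(W^2)_Y\Big)$
$\quad + \varepsilon\sigma\mu\Sigma_>\Big((|l_\pm| + |l_>|)WW^{|l_>|} + 2(n + 2m)WW^{|l_>|}_Y\Big)$
$\quad + \varepsilon\sigma\mu\Sigma_0\Big((|l_\pm| + |l_0|)WW^{|l_0|} + 2(n + 2m)WW^{|l_0|}_Y\Big)$.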

The second term in the RHS is the most complicated one. We only consider this term. The other two terms are done similarly.

(5.14)
$\varepsilon\sigma\mu\Sigma_>\Big((|l_\pm| + |l_>|)WW^{|l_>|} + 2(n + 2m)WW^{|l_>|}_Y\Big)$
$\quad \le \varepsilon\sigma\mu\Sigma_>\Big((2K + |k|)WW^{|k| + K} + 2(n + 2m)WW^{|k| + K}_Y\Big)$
$\quad \ll \varepsilon\sigma\mu\Sigma_>\Big(|k|WW^{|k| + K} + 2(K + n + 2m)WW^{|k| + K}_Y\Big)$
$\quad \le 3Ke^{\sigma}\varepsilon\sigma\mu\Sigma_>\Big(\frac{|k|}{K}WW^{|k|} + WW^{|k|}_Y\Big)$.

Here $|l_>| \le K + |k|$, because $l_> = k - l_\pm$, $|l_\pm| \le K$. We used Lemma 5.12 (5) to decrease the exponent of $W^{|k| + K}$. We also imposed a mild restriction:

(5.15) $2(n + 2m) \le K$.

If $|k| > K$, we get the majorant equation for the $W^{|k|}$ part in equation (5.8).

If $|k| \le K$, using Lemma 5.12 (1) and $W^{|k|} = W$, we replace the last "$\le$" in (5.14) by

$\le 6Ke^{\sigma}\varepsilon\sigma\mu\Sigma_> WW_Y$.

This is the majorant equation for W in equation (5.8).



For $\Sigma_\pm$ and $\Sigma_0$, we get the same majorant estimate with $\Sigma_>$ replaced by $\Sigma_\pm$ and $\Sigma_0$.

Now the problem is to find $a(\delta)$ that bounds $6Ke^{\sigma}\varepsilon\sigma\mu(2\Sigma_\pm + \Sigma_0 + \Sigma_>)$. This needs some careful analysis, and the result is summarized in the following lemma.

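Lemma 5.13. We have the following upper bound for $(2\Sigma_\pm + \Sigma_0 + \Sigma_>)$, where $\Sigma_\pm$, $\Sigma_0$, $\Sigma_>$ are defined in (5.13):

$(2\Sigma_\pm + \Sigma_0 + \Sigma_>)(\delta) \le \frac{(2K - 2\delta/(T\rho_3))^{n-1}}{(n-1)!} + \begin{cases} \dfrac{2(2K)^n}{n!}, & \text{if } \delta \le \dfrac{Tn}{2K}, \\[2mm] \dfrac{2(2K)^{n-1}T}{(n-1)!\,\delta}, & \text{if } \delta \ge \dfrac{Tn}{2K}. \end{cases}$

If we define $a(\delta) = 6Ke^{\sigma}\varepsilon\sigma\mu(2\Sigma_\pm + \Sigma_0 + \Sigma_>)$, then

$A(\delta^*) \le A(KT\rho_3) = \int_0^{KT\rho_3} a(s)\,ds$

is equation (5.12) in Proposition 5.11.

The proof of this lemma is given in Appendix A.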
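The integral of the bound in Lemma 5.13 can be cross-checked numerically. The sketch below assumes the piecewise bound in the form read here (a term $(2K - 2\delta/(T\rho_3))^{n-1}/(n-1)!$ plus a flat branch $2(2K)^n/n!$ for $\delta \le Tn/(2K)$ and a decaying branch $2(2K)^{n-1}T/((n-1)!\,\delta)$ afterwards), omits the common prefactor $6Ke^{\sigma}\varepsilon\sigma\mu$, and assumes $2K^2\rho_3 \ge n$ so that the crossover lies inside the integration range. Function names and sample parameters are hypothetical.

```python
import math

def sum_bound(delta, K, T, rho3, n):
    """Upper bound for 2*Sigma_pm + Sigma_0 + Sigma_> at time delta (assumed form)."""
    base = max(2 * K - 2 * delta / (T * rho3), 0.0)
    first = base ** (n - 1) / math.factorial(n - 1)
    if delta <= T * n / (2 * K):
        second = 2 * (2 * K) ** n / math.factorial(n)
    else:
        second = 2 * (2 * K) ** (n - 1) * T / (math.factorial(n - 1) * delta)
    return first + second

def integral_numeric(K, T, rho3, n, steps=20000):
    """Midpoint-rule integral of sum_bound over [0, K*T*rho3]."""
    upper = K * T * rho3
    h = upper / steps
    return sum(sum_bound((j + 0.5) * h, K, T, rho3, n) for j in range(steps)) * h

def integral_closed(K, T, rho3, n):
    """Closed form of the same integral, valid when 2*K^2*rho3 >= n."""
    pref = (2 * K) ** (n - 1) * K * rho3 * T / math.factorial(n)
    return pref * (1 + 2 * n / (K * rho3) * (1 + math.log(2 * K * K * rho3 / n)))
```

Multiplying `integral_closed` by $6Ke^{\sigma}\varepsilon\sigma\mu$ gives a quantity with the same structure as the right-hand side of (5.12): a leading term proportional to $(2K)^{n-1}K^2\rho_3 T/n!$ and a logarithmic correction.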



This lemma gives the restriction (5.12) in Proposition 5.11. What we have shown is that each integrand of equations (5.7) is majorated by $\sigma\mu e^{-|k|(\rho_3 + 2\rho_2)}W^{|k|}_\delta$, where $W^{|k|}$ satisfies equations (5.8). Combined with the majorant control on the initial condition in Lemma 5.9, this implies the LHS of equations (5.7) is majorated by the corresponding multiple of $W^{|k|}$. Now the proof of the proposition is complete.

□



5.5. The system after the averaging. The continuous averaging gives us the 
following information about the Hamiltonian vector fields. 

Lemma 5.14. At time $\delta = \delta^*$, we have

$h^k \ll \sigma\mu e^{-|k|\rho}W^{|k|}$, for $k \in D_0 \cup D_>(0)$,

$h^k \ll \sigma\mu e^{-2\rho_2|k|}e^{-\rho_3 K}W^{|k|}$, for $k \in D_\pm(0)$.

Proof. Recall in Lemma 5.5, we set $v^k = h^k e^{S_k(\delta)\langle\omega^*,k\rangle}$. Using the definition of $S_k(\delta)$ in Definition 5.1 we get $v^k = h^k$ for $k \in D_0 \cup D_>(0)$. Then Proposition 5.11 applies to such k's. For $k \in D_\pm(0)$, we must have $\rho_3|k| + S_k(\delta^*)\langle\omega^*,k\rangle = \rho_3 K$ according to the definition of $D_\pm(\delta)$. Then apply Proposition 5.11 to this case.

□



5.5.1. The estimate of the normal form. Now we use the information that we have obtained to prove Theorem 3.1. Let us define the change of variables

$(I,\theta,x,y)(0) \to (I,\theta,x,y)(\delta^*) := (I',\theta',x',y')$

obtained by the continuous averaging at the stopping time $\delta = \delta^*$. Then, the Hamiltonian (3.2) in these new variables is of the form

$H'(I',\theta',x',y') = H_0(I') + \varepsilon\bar\Psi(I',\theta',x',y') + \varepsilon\tilde\Psi(I',\theta',x',y')$,

where $\bar\Psi$ is the resonant term and $\tilde\Psi$ is the nonresonant term as defined in Definition 3.2. The following lemma gives estimates for the functions $\bar\Psi$ and $\tilde\Psi$.

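Lemma 5.15. Suppose $K \ge 2\sigma/(5\rho_1)$, $K \ge 2(n + 2m)$, $5(\sqrt{n} + 2\sqrt{m})\mathcal{R} \le \sigma$ and $A(\delta^*) \le 4\sigma^2/25$. Denote after the averaging, $\bar H \to \bar H(\delta^*) := \bar\Psi$ and $\tilde H \to \tilde H(\delta^*) := \tilde\Psi$, then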



f>2 



09 



l->2 



< —e 
~ P2 



< 



5/i 



>P2 ~ p n 



* + $ 



< ^ 

P2 P 2 



where $(I,x,y) \in (\mathcal{G}^n + 4\sigma/5) \times (\mathcal{W}^{2m} + 4\sigma/5)$ after the averaging.

Proof. Notice the I component of $h^k$ is $ikH^k$. So Lemma 5.14 implies



< ape 



iff* 



< ape 

-2 P2 \k\ e -p 3 K w ^ for k G D±(Q), 



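where the $\Psi^k$'s are Fourier coefficients of $\bar\Psi$ and $\tilde\Psi$. First we get an upper bound for W. From Lemma 5.10, for $|Y| \le (\sqrt{n} + 2\sqrt{m})\mathcal{R}$ we have

$W \le \frac{2}{\sigma - (\sqrt{n} + 2\sqrt{m})\mathcal{R}} \le \frac{5}{2\sigma}$

provided $5(\sqrt{n} + 2\sqrt{m})\mathcal{R} \le \sigma$. (We will see in Section 7 that the confinement radius $\mathcal{R} = o(1)$ as $\varepsilon \to 0$, so this condition is easy to satisfy.) The remaining width of analyticity for $(I', x', y')$ becomes $4\sigma/5$. We can also replace the condition $A(\delta^*) \le (\sigma - (\sqrt{n} + 2\sqrt{m})\mathcal{R})^2/4$ by

$A(\delta^*) \le \frac{4\sigma^2}{25}$.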
The factor $e^{\rho_2|k|}$ is absorbed in the $\|\cdot\|_{\rho_2}$ norm. For $k \in D_\pm(0)$, we are left with $\mu e^{-(\rho_3 + \rho_2)K}W$. For $k \in D_>(0)$, we have $W^{|k|} = We^{WA(\delta^*)|k|/K}$ according to (5.9).

We only need to ensure $WA(\delta^*)/K \le \rho_1$ so that $e^{WA(\delta^*)|k|/K} \le e^{\rho_1|k|}$. This is $\sigma/K \le 5\rho_1/2$ because $A(\delta^*) \le 4\sigma^2/25$ and $W \le 5/(2\sigma)$. (We will also see in Section 7 that $K \to \infty$ as $\varepsilon \to 0$, so this condition is also easy to satisfy.)



< afjW e 



-p 3 K 



Pi 



fc|<A" 



"P2| 



+ £«- 

|ft|>K 



(P3+P2)|fc| 



< apWe^ ( + T^^l < 2afiWe- p3K /p% < ^e~ p3K . 



Recalling the definition of K 



Pi 



in Definition 



p 3 M + KT 
< 5/x / pi 
" P2 XP V M+ftT 



5.1 



we find 



Similarly, we have the estimates for \&, \£ + \t, 



5* 
00 " 



□ 



5.5.2. The deviation of action variables in the real domain. 



Lemma 5.16. Under the same hypothesis as Lemma 5.15, after the averaging the 
total deviation of the variables is 

5epT 



(1 ,9 ,x ,y)- (1,6, x^)^ < 



27^3 + P2 y 



Here the norm $|\cdot|_\infty$ is taken in the real domain.

Proof. For simplicity, we consider only the I component. The other components are 

dF 



similar. From equation (4.2), we have 

dl 

dJ 



{I,F} 



86' 



where the RHS is a real function. Then 



(5.16) 
We have 



dF 
~dd 



dS. 



OF 

~d0 



-e o- k {8)kH k e i( > k & 

\k\<K 

since F = zYl\k\<K i&k(fi)H k e l ( k ' 6 ' defined in Lemma 5.2 



We also have 



o k (5)kH k < ape 



W(Y,5). 



2(, 
Hence we have the estimate (recall 2irT = T.)
dF 



89 



(S) 



< sail e-^+P^e-^'^W < 



5e/i 



\k\<K 



(P3 + Pi) 



-S/T 



lo 



' s * dF 



r 6* 



< 



Pi 



dF 



d5 < 



P2 



(P3 + P%) 



h [P3 + P2j 



n 



□ 



Proof of Theorem 3.1. Lemmas 5.15 and 5.16 complete the proof of the theorem. Notice the conditions of the theorem coincide with those of Lemmas 5.15 and 5.16, where the last condition in the theorem is exactly $A(\delta^*) \le 4\sigma^2/25$. Lemma 5.15 gives the estimate of $\Psi$ and Lemma 5.16 gives the estimates for the deviation of the variables. □



6. Local stability: stability in a vicinity of a given periodic orbit 



In this section, we derive a stability result using the normal form Theorem 3.1. Recall we have set $\omega^* = \frac{\partial H_0}{\partial I}(0)$ as the frequency vector of the periodic orbit that we are considering. We consider initial condition $I(0)$ such that $|I(0)|_2 \le r$.



Theorem 6.1. (Local stability) If the conditions of Theorem 3.1 and the following 
conditions are satisfied, 

P2 

5ne//T 

< r, 



{pi + Ps) 

then we have stability result: if the initial conditions |J(0)|2 < r, then one has 

\I(t)\ 2 <K := 8 1 — ±, for all time \t\ < T := T ^ 1 e M + llT . 

M_ 11 ~ \oj*\ 

Proof. The integrable part has the Taylor expansion around the point $I'(0)$,

$H_0(I'(t)) - H_0(I'(0)) = \langle\omega(I'(0)),\ I'(t) - I'(0)\rangle + \Big\langle \int_0^1 (1 - s)\nabla^2 H_0\big(sI'(t) + (1 - s)I'(0)\big)\,ds\,(I'(t) - I'(0)),\ I'(t) - I'(0)\Big\rangle$.

We obtain the following inequality using Definition 1.2:

(6.1) $\frac{1}{2}M_-|I'(t) - I'(0)|_2^2 \le |H_0(I'(t)) - H_0(I'(0))| + |\langle\omega(I'(0)),\ I'(t) - I'(0)\rangle|$.

We use the energy conservation and Lemma 5.15 for the first term of the RHS to get



< 



92 



\H (I'(t)) - H (I'(0))\ < e (|| (* + *) (/'(0))| p2 + || (* + *) (''(*))| J 

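For the second term in the RHS of inequality (6.1), we have

$|\langle\omega(I'(0)),\ I'(t) - I'(0)\rangle| \le |\langle\omega^*,\ I'(t) - I'(0)\rangle| + |\langle\omega(I'(0)) - \omega^*,\ I'(t) - I'(0)\rangle|$.

For the first term in the RHS, we use the Hamiltonian equation, Lemma 5.15 and the fact that $\big\langle\omega^*, \frac{\partial\bar\Psi}{\partial\theta}\big\rangle = 0$.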



(u*,l'(t)-l'(Q))\<\t\ 



(>■'. 



< T—e M + nT \uj*\, for \t\ < T. 

P2 



For the second term, 

|<u,(/'(0)) - io\I'(t) - J'(0))| < M + (|/(0)| 2 + |I'(0) - J(0)| 2 ) • \l'(t) - l'(0)| 2 



< M+ r + n 



(P3 + /> 2 )" 



5e/iT 



where we use 1 7(0) 1 2 < r and |I'(0)— 1(0)|2 < n- 

(P3 + P2) 1 

We have a factor n since we go from $|\cdot|_\infty$ to $|\cdot|_2$.



following from Lemma 



5.16 



If we set a = \I'(t) — /'(O)^, then we get an inequality of a from (6.1): 

5£fif 

!>'■> \ (P3 + P2) n 



— a < — ^- + T — Tre M + nT \u> | + M+ [r + n 



Pi 



a. 



We choose 
(6.2) 

to obtain 

We set 
(6.3) 

Then we have 



r < 



i 



5e//T 

n-, r- < r, 



(P3+P2) n 



a < 



2M+r + ^/4M|r 2 + ^ff M_ 
Ml 



^M_ < Mir 2 . 
a = \l'(t)-l'(0)\ 2 < 5 



M+r 
M_ ' 



$|I(t) - I(0)|_2 \le |I(t) - I'(t)|_2 + |I'(t) - I'(0)|_2 + |I'(0) - I(0)|_2 \le 2r + \frac{5rM_+}{M_-}$,

$|I(t)|_2 \le |I(0)|_2 + |I(t) - I(0)|_2 \le 3r + \frac{5rM_+}{M_-} \le \frac{8rM_+}{M_-}$.

The proof is now complete. □
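The last algebraic step of the proof can be checked with a small sketch. It assumes that, after using (6.2), inequality (6.1) takes the form $\frac{M_-}{2}a^2 \le c + 2M_+ r\,a$, where $c$ is a stand-in for the exponentially small terms, and that the role of (6.3) is the condition $2cM_- \le M_+^2r^2$. These readings, the function name and the sample numbers are assumptions made for illustration.

```python
import math

def a_upper(M_minus, M_plus, r, c):
    """Largest a >= 0 satisfying (M_minus / 2) * a**2 <= c + 2 * M_plus * r * a."""
    return (2 * M_plus * r + math.sqrt(4 * M_plus ** 2 * r ** 2 + 2 * M_minus * c)) / M_minus

def quadratic_gap(a, M_minus, M_plus, r, c):
    """Left side minus right side of the assumed inequality; zero at the extremal a."""
    return 0.5 * M_minus * a * a - c - 2 * M_plus * r * a
```

Under the condition $2cM_- \le M_+^2r^2$ the square root is at most $\sqrt{5}M_+r$, so the extremal value is at most $(2 + \sqrt{5})M_+r/M_- < 5M_+r/M_-$, in line with the bound used for $|I'(t) - I'(0)|_2$.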



Remark 6.1. We introduce a restriction (6.3) instead of introducing a constant g as was done in [LNN], [N]. The two restrictions of this theorem imply that $\rho_2, \rho_3$ can be sufficiently small if $\varepsilon$ is. Then $\rho_1$ can be very close to $\rho$. The restrictions of Theorem 3.1 are also satisfied for $\varepsilon$ small enough. Then we get improved stability time compared with [LNN], [N].



7. Global stability: stability for arbitrary initial data 

In this section, we consider the stability result for arbitrary initial data and give a proof of Theorem 2.1. We first prove the following lemma.

Lemma 7.1. Let us fix (1 <)Q G M and assume — tt? < ^— ; , — , 

V ' QV(n-l) 4sup Je5 n |V 3 #o|oc 

then for any I G Q n , there exists an integer q, 1 < q < Q, and a point I* G Q n such 

that | J — J* 1 2 < = — . andu>(I*) is rational vector of period T = q/loj^I)^. 

Proof. First recall the Dirichlet theorem for simultaneous approximation:

for any $\alpha \in \mathbb{R}^n$, $Q \in \mathbb{R}$, and $Q > 1$, there exists an integer q, $1 \le q < Q$, such that

$|q\alpha - \mathbb{Z}^n|_\infty \le Q^{-1/n}$.

An improvement of the estimate can be obtained by rescaling $\alpha$ to $\alpha/|\alpha|_\infty$. Then apply the Dirichlet theorem to approximate the remaining $n - 1$ components of $\alpha$ after removing one of its $\pm 1$ entries. We get the following:

there exists a rational vector $\alpha^*$ of period $T = \frac{q}{|\alpha|_\infty}$, $q \in \mathbb{N}$, $1 \le q < Q$ and $|\alpha^* - \alpha|_\infty \le \frac{1}{TQ^{1/(n-1)}}$ (see the only Proposition in [N]).
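The Dirichlet theorem quoted above can be illustrated by brute force. The sketch below searches for the smallest integer q with $1 \le q < Q$ such that every component of $q\alpha$ is within $Q^{-1/n}$ of an integer; the function name and the sample vector are hypothetical, and the search is only meant for small Q.

```python
import math

def dirichlet_q(alpha, Q):
    """Return (q, err) for the smallest integer q with 1 <= q < Q and
    err = max_j dist(q * alpha_j, Z) <= Q ** (-1 / n), or None if no such q is found."""
    n = len(alpha)
    bound = Q ** (-1.0 / n)
    q = 1
    while q < Q:
        err = max(abs(q * a - round(q * a)) for a in alpha)
        if err <= bound:
            return q, err
        q += 1
    return None

def sample_vector():
    """Hypothetical frequency vector used for the illustration."""
    return (math.sqrt(2.0), math.sqrt(3.0))
```

For the sample vector and Q = 100 the theorem guarantees that the search succeeds with an error of at most 0.1.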

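The frequency vector is $\omega = \nabla H_0(I)$. Consider two points $I^*$ and $I$ such that $\omega(I^*)$ is as stated in the lemma and approximates $\omega(I)$ in the same way as $\alpha^*$ approximates $\alpha$. We have

$\omega(I) - \omega(I^*) = \int_0^1 \nabla\omega(tI + (1 - t)I^*)\,dt\,(I - I^*)$.

Hence from Definition 1.2,

$M_-|I - I^*|_2^2 \le \Big\langle \int_0^1 \nabla\omega(tI + (1 - t)I^*)\,dt\,(I - I^*),\ (I - I^*)\Big\rangle = \langle (I - I^*),\ \omega(I) - \omega(I^*)\rangle \le |I - I^*|_2\,|\omega(I) - \omega(I^*)|_2$.

This implies

$M_-|I - I^*|_2 \le |\omega(I) - \omega(I^*)|_2 \le \sqrt{n - 1}\,|\omega(I) - \omega(I^*)|_\infty \le \frac{\sqrt{n - 1}\,|\omega(I)|_\infty}{qQ^{1/(n-1)}}$.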



In order to make sure the point I* can be found given /, we need to show the fre- 

/ Ml 

quency map can be inverted, which can be done within the ball B I lo(I*), 



4|V 3 i/ |c 



centered at with radius 



Ml 



So we assume 



n 



1 



< 



4|V 3 F | 

Ml 



Ql/(n-l) 4|V 3 F 



using implicit function theorem (see [LNNJ ) . 

□ 



Proof of Theorem 2. 1 , We define r 



^7} T) and 71 = 8r**± = 8 ^EESM± 



as in Lemma 



7.1 



If we set Q = e~ 



then we have 



n 



8\/n£TM+ ffl/2n < 8y / ^TM + ^ 1/2n 



mIt 



Mi 



VH \ 



KT 



8 v / n Tr TM 4 



r l/2n 



The stability time in Theorem 6.1 
1 



is 



r 



exp 



Pi 



> 



M + KT J ~ sup /e gn \VH 



exp 



8 v / n Tr lM: 



-£ 2n 



Now let us analyze the restrictions that we have. The restrictions are from Theo- 
Theorem 6.1 and Lemma 7.1 The quantities T, r, 1Z, K satisfy the following: 

Pi Ml 



rem 



3.1 



1< TMoo < Q =£— 5T, K 



Pi 



,1/2 



£ 1/2 |o;|oo 



M_ 

8v / ^lM 4 
Ml 



< r < e 



-£ 



-l/2n 



l/2n. 



<K<e 



l/2n 



1M+ 



Ml 



We substitute the bounds for $T, r, \mathcal{R}, K$ into the restrictions that we have to obtain the restrictions of Theorem 2.1. The first restriction in Theorem 2.1 is from the first one of Theorem 6.1. The second in Theorem 2.1 is a collection of the first two of Theorem 3.1, the second of Theorem 6.1 and that of Lemma 7.1. The last in Theorem 2.1 is the last one of Theorem 3.1. We break it into two inequalities by setting $\frac{2n}{K\rho_3}\left(1 + \ln\frac{2K^2\rho_3}{n}\right) \le 1$. □
Remark 7.1. The first restriction in Theorem 2.1 can be satisfied by making $\mu$ smaller while $\varepsilon$ larger. This will lead to shorter stability time. The $\mu$ here plays the same role as the factor g in [LNN], [N]. The second restriction can always be satisfied by making $\varepsilon$ small. The third restriction can be satisfied by making $\mu$ or $\rho_1/\rho_3$ small. However, since $n!$ grows very fast, for large n, this restriction is easy to satisfy.



Appendix A. Proof of Lemma 5.13

The proof is done in the following Claims 1, 2, 3, which estimate $\Sigma_\pm$, $\Sigma_>$, $\Sigma_0$ in Lemma 5.13 respectively. Before the proof of the lemma, let us first analyze the geometry of numbers involved.



A.1. The geometry of integer vectors. Let us look at Figure 2.

Figure 2. Counting the number of combinations.

• The diamond: the diamond in the figure encloses all the vectors k with $|k| \le K$ (in 3-dim it is an octahedron. In general it is a ball of radius K under the $\ell^1$ norm). The total number of integer vectors enclosed in the diamond is approximately $\frac{(2K)^n}{n!}$. Indeed, in n-dim, the diamond consists of $2^n$ simplices. Each of the simplices has volume $\frac{K^n}{n!}$.

• The hyperplane: the small arrow indicates the rational frequency $\omega^*$. $HP_0$ is a hyperplane that is perpendicular to $\omega^*$. $HP_0 = D_0$ and the $(n - 1)$-dim volume of $HP_0 \cap$ Diamond is less than $\frac{(2K)^{n-1}}{(n-1)!}$, which is the $(n - 1)$-dim volume of an $(n - 1)$-dim Diamond. Any vector that lies above $HP_0$ has positive inner product with $\omega^*$, while any vector below has negative inner product. Moreover, if two vectors lie on the same hyperplane which is parallel to $HP_0$, they will have the same inner product with $\omega^*$. Let us denote $HP_d = \{k \in \mathbb{Z}^n \mid \langle k,\omega^*\rangle = d/T\}$. $HP_0 \cap$ Diamond contains at most $\frac{(2K)^{n-1}}{(n-1)!}$ integer points.

• The parallelogram: consider the vectors $l_+, l_-, k$ in Figure 2. Suppose we have the relation $l_+ + l_- = k$. Then the three vectors together with the origin form a parallelogram. Suppose $\langle k,\omega^*\rangle T = 1$, and $\langle l_+,\omega^*\rangle T = 2$, then $\langle l_-,\omega^*\rangle T = -1$. $l_+$ and $l_-$ can move on their corresponding hyperplanes, but a parallelogram is always preserved.

• The shape of the diamond under the averaging flow: in the definition of $D_\pm(\delta)$, we have the restriction $|l_\pm|\rho_3 + |\langle l_\pm,\omega^*\rangle|\delta \le \rho_3 K$ for $l_\pm \in D_\pm(\delta)$. When $\delta = 0$, this is our diamond. When $\delta$ increases, the diamond will collapse, i.e. the integer vectors become fewer on $HP_d$:

(A.1) $|l_\pm| \le K - \frac{\delta}{\rho_3}|\langle l_\pm,\omega^*\rangle|$.

The rate of decrease depends on the inner product $|\langle l_\pm,\omega^*\rangle|$. The farther a hyperplane $HP_d$ is away from $HP_0$ (the larger the d), the faster it collapses (with volume decreasing rate $d/(\rho_3 T)$). $HP_0$ does not change at all. When $\delta = \delta^*$, the diamond would collapse to its intersection with $HP_0$. By then we would have successfully killed all the nonresonant terms up to the desired exponential smallness $e^{-\rho_2|k| - \rho_3 K}$. We denote the collapsed diamond at time $\delta$ by Diamond($\delta$).
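The comparison between the number of integer vectors in the diamond and its volume can be checked directly for small dimensions. The sketch below counts lattice points in the $\ell^1$ ball by brute force and compares with the volume $(2K)^n/n!$; the function names are hypothetical.

```python
from itertools import product
from math import factorial

def count_l1_ball(n, K):
    """Number of integer vectors k in Z^n with |k|_1 <= K (brute force)."""
    return sum(1 for k in product(range(-K, K + 1), repeat=n)
               if sum(map(abs, k)) <= K)

def volume_l1_ball(n, K):
    """Volume of the l^1 ball of radius K in R^n: 2^n simplices of volume K^n / n!."""
    return (2 * K) ** n / factorial(n)
```

In dimension 2 the exact count is $2K^2 + 2K + 1$ while the volume is $2K^2$, so the volume captures the leading order in K and the relative difference shrinks as K grows.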



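A.2. Estimate of $\Sigma_\pm$, $\Sigma_>$, $\Sigma_0$ and the proof of Lemma 5.13. Now we obtain estimates of $\Sigma_\pm$, $\Sigma_>$, $\Sigma_0$ for fixed k.

A.2.1. Claim 1: The sum $\Sigma_\pm(\delta)$ defined in equation (5.13) can be estimated as follows:

$\Sigma_\pm(\delta) \le \begin{cases} \dfrac{(2K)^n}{2\,n!}, & \text{if } \delta \le \dfrac{Tn}{2K}, \\[2mm] \dfrac{(2K)^{n-1}T}{2(n-1)!\,\delta}, & \text{if } \delta \ge \dfrac{Tn}{2K}. \end{cases}$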
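Proof. In the proof, the vector k is fixed. The $\Sigma_\pm$ is defined in equation (5.13), which can be estimated as

$\Sigma_\pm(\delta) \le \sum_{l_+ + l_- = k} e^{-(S_{l_+}\langle\omega^*,l_+\rangle + S_{l_-}\langle\omega^*,l_-\rangle - S_k\langle\omega^*,k\rangle)}$,

where we have dropped $e^{-(\rho_3 + 2\rho_2)(|l_+| + |l_-| - |k|)}$, since

$l_+ + l_- = k \implies |l_+| + |l_-| \ge |k| \implies e^{-(\rho_3 + 2\rho_2)(|l_+| + |l_-| - |k|)} \le 1$.

From the relation $\langle\omega^*,l_+\rangle + \langle\omega^*,l_-\rangle = \langle\omega^*,k\rangle$, and the inequality $\delta \ge |S_k(\delta)|$, we have the following two cases depending on the sign of $\langle\omega^*,k\rangle$.

If $\langle\omega^*,k\rangle \ge 0$, then

$S_{l_+}\langle\omega^*,l_+\rangle + S_{l_-}\langle\omega^*,l_-\rangle - S_k\langle\omega^*,k\rangle \ge \delta(\langle\omega^*,k\rangle - \langle\omega^*,l_-\rangle) - \delta\langle\omega^*,l_-\rangle - S_k\langle\omega^*,k\rangle$
$\quad \ge (\delta - S_k)\langle\omega^*,k\rangle - 2\delta\langle\omega^*,l_-\rangle \ge 2\delta|\langle\omega^*,l_-\rangle|$.

If $\langle\omega^*,k\rangle < 0$, then

$S_{l_+}\langle\omega^*,l_+\rangle + S_{l_-}\langle\omega^*,l_-\rangle - S_k\langle\omega^*,k\rangle \ge \delta\langle\omega^*,l_+\rangle - \delta(\langle\omega^*,k\rangle - \langle\omega^*,l_+\rangle) - S_k\langle\omega^*,k\rangle$
$\quad \ge (-\delta - S_k)\langle\omega^*,k\rangle + 2\delta\langle\omega^*,l_+\rangle \ge 2\delta|\langle\omega^*,l_+\rangle|$.

Moreover, when $\delta = 0$, the number of integer vectors contained in $HP_d \cap$ Diamond(0) is no greater than $\frac{(2K)^{n-1}}{(n-1)!}$. It is zero when $|d| > K$. Since k is fixed, we can vary either $l_+$ or $l_-$. The other one will be determined uniquely. According to the analysis above, we sum over $l_+$ if $\langle\omega^*,k\rangle \ge 0$ while over $l_-$ otherwise. We consider the $l_+$ case for instance. As $\delta$ increases, on each $HP_d$ the number of integer vectors contained in $HP_d \cap$ Diamond($\delta$) is no greater than $\frac{(2K - 2d\delta/(T\rho_3))^{n-1}}{(n-1)!}$ according to inequality (A.1). It is zero when $|d| > K$ or $K < d\delta/(T\rho_3)$. Now we have the estimate

$\Sigma_\pm(\delta) \le \sum_d \frac{(2K - 2d\delta/(T\rho_3))^{n-1}}{(n-1)!}e^{-2d\delta/T} \le \int_0^K \frac{(2K - 2d\delta/(T\rho_3))^{n-1}}{(n-1)!}e^{-2d\delta/T}\,d(d)$
$\quad \le \frac{(2K)^{n-1}}{(n-1)!}\int_0^\infty e^{-2d\delta/T}\,d(d) \le \frac{(2K)^{n-1}T}{2(n-1)!\,\delta}$.

This estimate is poor when $\delta$ is close to zero. But in fact when $\delta = 0$, the upper bound is $\Sigma_\pm(0) \le \frac{(2K)^n}{2\,n!}$, and the upper bound is monotonically decreasing w.r.t. $\delta$. So we can use the bound stated in Claim 1. See Figure 3. □

Figure 3. Upper bound of $a(\delta)$.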
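Two elementary facts behind Claim 1 can be checked with a short sketch: the discrete sum $\sum_{d \ge 1} e^{-2d\delta/T}$ is dominated by the integral $\int_0^\infty e^{-2d\delta/T}\,d(d) = T/(2\delta)$, and the two branches of the bound agree at the crossover $\delta = Tn/(2K)$. The function names and sample parameters are hypothetical.

```python
import math

def discrete_tail(delta, T, terms=2000):
    """Truncated sum over d = 1, 2, ..., terms of exp(-2 d delta / T)."""
    return sum(math.exp(-2.0 * d * delta / T) for d in range(1, terms + 1))

def integral_tail(delta, T):
    """Integral of exp(-2 d delta / T) over d in [0, infinity)."""
    return T / (2.0 * delta)

def claim1_bound(delta, K, n, T):
    """Piecewise bound of Claim 1: flat for small delta, decaying like 1/delta afterwards."""
    flat = (2 * K) ** n / (2 * math.factorial(n))
    decaying = (2 * K) ** (n - 1) * T / (2 * math.factorial(n - 1) * delta)
    return flat if delta <= T * n / (2 * K) else decaying
```

The continuity at the crossover explains why the flat bound can be used for small $\delta$ and the decaying one for large $\delta$ without any jump in the resulting bound.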



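A.2.2. Claim 2: The sum $\Sigma_>(\delta)$ defined in equation (5.13) can be estimated as follows:

$\Sigma_>(\delta) \le \begin{cases} \dfrac{(2K)^n}{n!}, & \text{if } \delta \le \dfrac{Tn}{2K}, \\[2mm] \dfrac{(2K)^{n-1}T}{(n-1)!\,\delta}, & \text{if } \delta \ge \dfrac{Tn}{2K}. \end{cases}$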



Proof. It is denned in equation (5.13) that 

£>(£) = V e _(|z±l+l ' >l_|fc|)(/33+2p2) e _ (' Si ± <w *' Z±)+ ' Si > <w *' Z>)_ ' Sfc<a; *' fe) ) 



z±+i>=fc 



We first show 



-Sj + (w*,i±) 



e>(*)< E e ~ 5i± 

i±+Z>=fc 

According to the definition of /> € D>(6), the Fourier term corresponding to /> is 
of the size 

-Si > (uj*,l > )-\l > \p 3 < p-p^K 



But e s k( ul *' k ) P3\k\ > e due ^ Definition 5.1 The equality is achieved only 
when k G D > (5). So we know 

g -5 ;> (w*,« > )-|« > |p3+S'fc(w*,fc}+p3|fc| <- ^ 

We drop the e-l^ 3 and e -(\i±\+\hHk\)(2 P2 ) to obtain E> ^ < ^ e - s '±("*> l ±) . 

This is essentially the same as the Case £± discussed above. So we have the bound 
stated in Claim 2. 

□ 
A.2.3. Claim 3: The sum $\Sigma_0(\delta)$ defined in equation (5.13) can be estimated as follows:

$\Sigma_0(\delta) \le \frac{(2K - 2|\langle\omega^*,k\rangle|\delta/\rho_3)^{n-1}}{(n-1)!} \le \frac{(2K - 2\delta/(T\rho_3))^{n-1}}{(n-1)!}$.

Proof. The term $\Sigma_0(\delta)$ turns out to be the most troublesome term. Again equation (5.13) tells us

$\Sigma_0(\delta) = \sum_{l_\pm + l_0 = k} e^{-(|l_\pm| + |l_0| - |k|)(\rho_3 + 2\rho_2)}\,e^{-(S_{l_\pm}\langle\omega^*,l_\pm\rangle + S_{l_0}\langle\omega^*,l_0\rangle - S_k\langle\omega^*,k\rangle)}$.

Because of the relation $\langle\omega^*,l_\pm\rangle + \langle\omega^*,l_0\rangle = \langle\omega^*,k\rangle$ and $\langle\omega^*,l_0\rangle = 0$, we get $\langle\omega^*,l_\pm\rangle = \langle\omega^*,k\rangle$ and

$S_{l_\pm}(\delta)\langle\omega^*,l_\pm\rangle \ge S_k(\delta)\langle\omega^*,k\rangle$.

The reason is, we know that $l_\pm \in D_\pm(\delta)$ until time $\delta$, but we do not know where k is. This implies $|S_{l_\pm}(\delta)| \ge |S_k(\delta)|$. The "=" is achieved only if $k \in D_-(\delta) \cup D_+(\delta)$. So we get

$\Sigma_0(\delta) \le \sum_{l_\pm + l_0 = k} e^{-(|l_\pm| + |l_0| - |k|)(\rho_3 + 2\rho_2)} \le \sum_{l_\pm + l_0 = k} 1$.

Now consider Figure 2. Since $l_0 \in HP_0$, we get that $l_\pm$ and k must lie on the same hyperplane. So $\sum_{l_\pm + l_0 = k} 1$ is bounded by the number of the possible $l_\pm$'s, which is

$\frac{(2K - 2|\langle\omega^*,k\rangle|\delta/\rho_3)^{n-1}}{(n-1)!}$.

This gives Claim 3.

□



Proof of Lemma 5.13. We simply add up the upper bounds for $\Sigma_0$, $\Sigma_>$, $\Sigma_\pm$ to get an upper bound for $2\Sigma_\pm + \Sigma_> + \Sigma_0$. This proves Lemma 5.13. □



Appendix B. Elements on majorant estimates 

In this appendix, we collect some basics about the majorant relation. The material can be found in Chapter 5 of [TZ]. The majorant relation "$\ll$" is defined in Definition 4.1.



Lemma B.1. The relation "$\ll$" satisfies the following properties:

(1) If $f_1 \ll g_1$ and $f_2 \ll g_2$, then $f_1 + f_2 \ll g_1 + g_2$ and $f_1 f_2 \ll g_1 g_2$.

(2) If $f \ll g$, then $\frac{\partial f}{\partial z_j} \ll \frac{\partial g}{\partial z_j}$ for any $j = 1, \dots, m$.

(3) If $f(z,\lambda) \ll g(z,\lambda)$ for any value of the parameter $\lambda \in [a,b]$, then

$\int_a^b f(z,\lambda)\,d\lambda \ll \int_a^b g(z,\lambda)\,d\lambda$.

(4) Let $|f(z)| \le c$ in the domain $\{z = (z_1, \dots, z_m) : |z_j| \le b,\ j = 1, \dots, m\}$. Then $f(z) \ll c/w \ll \frac{cb}{b - \bar z}$, where $w = b^{-m}(b - z_1)\cdots(b - z_m)$ and $\bar z = z_1 + z_2 + \cdots + z_m$.

Moreover, the majorant relation is also preserved by solving differential equations or integral equations.
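Properties (1) and (2) can be illustrated in the simplest setting of polynomials in one variable, where $f \ll g$ is read as $|f_j| \le g_j$ for every Taylor coefficient. This is a minimal sketch under that reading; the coefficient-list representation and the helper names are assumptions made for illustration.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    m = max(len(p), len(q))
    p = list(p) + [0.0] * (m - len(p))
    q = list(q) + [0.0] * (m - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_diff(p):
    """Derivative of a polynomial; the derivative of a constant is [0.0]."""
    return [i * a for i, a in enumerate(p)][1:] or [0.0]

def is_majorated(f, g):
    """Check f << g coefficientwise: |f_j| <= g_j for all j."""
    m = max(len(f), len(g))
    f = list(f) + [0.0] * (m - len(f))
    g = list(g) + [0.0] * (m - len(g))
    return all(abs(a) <= b for a, b in zip(f, g))
```

With $f_1 = 1 - 2z + z^2/2 \ll g_1 = 1 + 2z + z^2$ and $f_2 = -1 + z \ll g_2 = 2 + z$, the sums, products and derivatives inherit the relation, as the lemma states.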

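Definition B.1. Consider an ODE system

(B.1) $f^k_\delta(z,\delta) = F^k(f(z,\delta), z, \delta)$, $\quad f^k(z,0) = f^k_0(z)$

with some known functional $F^k$ and initial data $f^k_0$. We call a system

(B.2) $\mathbf{f}^k_\delta(z,\delta) = \mathbf{F}^k(\mathbf{f}(z,\delta), z, \delta)$, $\quad \mathbf{f}^k(z,0) = \mathbf{f}^k_0(z)$

a majorant system associated with equation (B.1) if

(a) $f^k_0(z) \ll \mathbf{f}^k_0(z)$ for any $k \in \mathbb{Z}$, and

(b) $F^k(g(z), z, \delta) \ll \mathbf{F}^k(\mathbf{g}(z), z, \delta)$ for any $k \in \mathbb{Z}$, $\delta \ge 0$, and $g, \mathbf{g}$ such that $g \ll \mathbf{g}$.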



We have the theorem 



Theorem B.1 (Chapter 5 of [TZ]). If $\mathbf{f}(z,\delta)$, $0 \le \delta \le \delta_0$ is a solution of the majorant system (B.2) associated with (B.1), then the system (B.1) has a solution and $f^k(z,\delta) \ll \mathbf{f}^k(z,\delta)$ for any $\delta \in [0,\delta_0]$, $k \in \mathbb{Z}$. The same is true if we rewrite systems (B.1), (B.2) in the integral form:

(B.3)
$f^k(z,\delta) = f^k_0(z) + \int_0^\delta F^k(f(z,s), z, s)\,ds$,
$\mathbf{f}^k(z,\delta) = \mathbf{f}^k_0(z) + \int_0^\delta \mathbf{F}^k(\mathbf{f}(z,s), z, s)\,ds$.

With this theorem, we treat $\delta$ as a parameter instead of a variable. So we do not need to do the Taylor expansion w.r.t. $\delta$.